from numerical sensor data to semantic representations1186422/fulltext01.pdf · scribe unseen but...

From Numerical Sensor Data to Semantic Representations: A Data-driven Approach for Generating Linguistic Descriptions




To Sepideh

For her kindness and devotion, and for her endless support.

To Sepideh


To Sepideh


To Sepideh


Örebro Studies in Technology 78

HADI BANAEE



HADI BANAEE



HADI BANAEE



HADI BANAEE


© Hadi Banaee, 2018

Title: From Numerical Sensor Data to Semantic Representations: A Data-driven Approach for Generating Linguistic Descriptions

Publisher: Örebro University 2018 www.publications.oru.se

Print: Örebro University, Repro 03/2018

ISSN 1650-8580 ISBN 978-91-7529-240-3





ISSN 1650-8580 ISBN 978-91-7529-240-3





ISSN 1650-8580 ISBN 978-91-7529-240-3





ISSN 1650-8580 ISBN 978-91-7529-240-3

Abstract Hadi Banaee (2018): From Numerical Sensor Data to Semantic Representations: A Data-driven Approach for Generating Linguistic Descriptions. Örebro Studies in Technology 78.

In our daily lives, sensors recordings are becoming more and more ubiquitous. With the increased availability of data comes the increased need of systems that can represent the data in human interpretable concepts. In order to describe unknown observations in natural language, an artificial intelligence system must deal with several issues involving perception, concept formation, and linguistic description. These issues cover various subfields within artificial intelligence, such as machine learning, cognitive science, and natural language generation.

The aim of this thesis is to address the problem of semantically modelling and describing numerical observations from sensor data. This thesis introduces data-driven approaches to perform the tasks of mining numerical data and creating semantic representations of the derived information in order to de-scribe unseen but interesting observations in natural language.

The research considers creating a semantic representation using the theory of conceptual spaces. In particular, the central contribution of this thesis is to present a data- riven approach that automatically constructs conceptual spaces from labelled numerical data sets. This constructed conceptual space then utilises semantic inference techniques to derive linguistic interpretations for novel unknown observations. Another contribution of this thesis is to explore an instantiation of the proposed approach in a real-world application. Specifically, this research investigates a case study where the proposed approach is used to describe unknown time series patterns that emerge from physiological sensor data. This instantiation first presents automatic data analysis methods to extract time series patterns and temporal rules from multiple channels of physiological sensor data, and then applies various linguistic description approaches (including the proposed semantic representation based on conceptual spaces) to generate human-readable natural language descriptions for such time series patterns and temporal rules.

The main outcome of this thesis is the use of data-driven strategies that ena-ble the system to reveal and explain aspects of sensor data which may other-wise be difficult to capture by knowledge-driven techniques alone. Briefly put, the thesis aims to automate the process whereby unknown observations of data can be 1) numerically analysed, 2) semantically represented, and eventually 3) linguistically described.

Keywords: Semantic representations, Conceptual spaces, Natural language generation, Temporal rule mining, Physiological sensors, Health monitoring system.

Hadi Banaee, School of Science and Technology Örebro University, SE-701 82 Örebro, Sweden, [email protected]




The research considers creating a semantic representation using the theory of conceptual spaces. In particular, the central contribution of this thesis is to present a data-riven approach that automatically constructs conceptual spaces from labelled numerical data sets. This constructed conceptual space then utilises semantic inference techniques to derive linguistic interpretations for novel unknown observations. Another contribution of this thesis is to explore an instantiation of the proposed approach in a real-world application. Specifically, this research investigates a case study where the proposed approach is used to describe unknown time series patterns that emerge from physiological sensor data. This instantiation first presents automatic data analysis methods to extract time series patterns and temporal rules from multiple channels of physiological sensor data, and then applies various linguistic description approaches (including the proposed semantic representation based on conceptual spaces) to generate human-readable natural language descriptions for such time series patterns and temporal rules.







The research considers creating a semantic representation using the theory of conceptual spaces. In particular, the central contribution of this thesis is to present a data- riven approach that automatically constructs conceptual spaces from labelled numerical data sets. This constructed conceptual space then utilises semantic inference techniques to derive linguistic interpretations for novel unknown observations. Another contribution of this thesis is to explore an instantiation of the proposed approach in a real-world application. Specifically, this research investigates a case study where the proposed approach is used to describe unknown time series patterns that emerge from physiological sensor data. This instantiation first presents automatic data analysis methods to extract time series patterns and temporal rules from multiple channels of physiological sensor data, and then applies various linguistic description approaches (including the proposed semantic representation based on conceptual spaces) to generate human-readable natural language descriptions for such time series patterns and temporal rules.







The research considers creating a semantic representation using the theory of conceptual spaces. In particular, the central contribution of this thesis is to present a data-riven approach that automatically constructs conceptual spaces from labelled numerical data sets. This constructed conceptual space then utilises semantic inference techniques to derive linguistic interpretations for novel unknown observations. Another contribution of this thesis is to explore an instantiation of the proposed approach in a real-world application. Specifically, this research investigates a case study where the proposed approach is used to describe unknown time series patterns that emerge from physiological sensor data. This instantiation first presents automatic data analysis methods to extract time series patterns and temporal rules from multiple channels of physiological sensor data, and then applies various linguistic description approaches (including the proposed semantic representation based on conceptual spaces) to generate human-readable natural language descriptions for such time series patterns and temporal rules.




Acknowledgements

Conceptual Space of Acknowledgement1:

1There are no linguistic descriptions generated for this conceptual space.

iii

Acknowledgements

ConceptualSpaceofAcknowledgement1:

1Therearenolinguisticdescriptionsgeneratedforthisconceptualspace.

iii

Acknowledgements

Conceptual Space of Acknowledgement1:

1There are no linguistic descriptions generated for this conceptual space.

iii

Acknowledgements

ConceptualSpaceofAcknowledgement1:

1Therearenolinguisticdescriptionsgeneratedforthisconceptualspace.

iii

Contents

1 Introduction 11.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21.2 Problem Statement . . . . . . . . . . . . . . . . . . . . . . . . . 51.3 Research Question . . . . . . . . . . . . . . . . . . . . . . . . . 61.4 Contributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71.5 Thesis Outline . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91.6 Publications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12

I Creating Semantic Representations for Numerical Data 17

2 Background and Related Work 192.1 Semantic Representation . . . . . . . . . . . . . . . . . . . . . . 192.2 On the Theory of Conceptual Spaces . . . . . . . . . . . . . . . 20

2.2.1 Identifying Quality Dimensions . . . . . . . . . . . . . . 242.2.2 Related Work on Conceptual Spaces and AI . . . . . . . 25

2.3 Generating Linguistic Descriptions . . . . . . . . . . . . . . . . 272.3.1 Linguistic Descriptions of Data (LDD) . . . . . . . . . . 272.3.2 Natural Language Generation (NLG) . . . . . . . . . . . 29

2.4 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33

3 Data-Driven Construction of Conceptual Spaces 353.1 Domain and Quality Dimension Specification . . . . . . . . . . 37

3.1.1 Feature Subset Ranking . . . . . . . . . . . . . . . . . . 393.1.2 Feature Subset Grouping . . . . . . . . . . . . . . . . . . 41

3.2 Concept Representation . . . . . . . . . . . . . . . . . . . . . . 443.2.1 Convex Regions of Concepts . . . . . . . . . . . . . . . 463.2.2 Context-dependent Weights of Concepts . . . . . . . . . 49

3.3 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49

v

Contents

1Introduction11.1Motivation.............................21.2ProblemStatement.........................51.3ResearchQuestion.........................61.4Contributions............................71.5ThesisOutline............................91.6Publications.............................12

ICreatingSemanticRepresentationsforNumericalData17

2BackgroundandRelatedWork192.1SemanticRepresentation......................192.2OntheTheoryofConceptualSpaces...............20

2.2.1IdentifyingQualityDimensions..............242.2.2RelatedWorkonConceptualSpacesandAI.......25

2.3GeneratingLinguisticDescriptions................272.3.1LinguisticDescriptionsofData(LDD)..........272.3.2NaturalLanguageGeneration(NLG)...........29

2.4Conclusions.............................33

3Data-DrivenConstructionofConceptualSpaces353.1DomainandQualityDimensionSpecification..........37

3.1.1FeatureSubsetRanking..................393.1.2FeatureSubsetGrouping..................41

3.2ConceptRepresentation......................443.2.1ConvexRegionsofConcepts...............463.2.2Context-dependentWeightsofConcepts.........49

3.3Discussion..............................49

v

Contents

1 Introduction 11.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21.2 Problem Statement . . . . . . . . . . . . . . . . . . . . . . . . . 51.3 Research Question . . . . . . . . . . . . . . . . . . . . . . . . . 61.4 Contributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71.5 Thesis Outline . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91.6 Publications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12

I Creating Semantic Representations for Numerical Data 17

2 Background and Related Work 192.1 Semantic Representation . . . . . . . . . . . . . . . . . . . . . . 192.2 On the Theory of Conceptual Spaces . . . . . . . . . . . . . . . 20

2.2.1 Identifying Quality Dimensions . . . . . . . . . . . . . . 242.2.2 Related Work on Conceptual Spaces and AI . . . . . . . 25

2.3 Generating Linguistic Descriptions . . . . . . . . . . . . . . . . 272.3.1 Linguistic Descriptions of Data (LDD) . . . . . . . . . . 272.3.2 Natural Language Generation (NLG) . . . . . . . . . . . 29

2.4 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33

3 Data-Driven Construction of Conceptual Spaces 353.1 Domain and Quality Dimension Specification . . . . . . . . . . 37

3.1.1 Feature Subset Ranking . . . . . . . . . . . . . . . . . . 393.1.2 Feature Subset Grouping . . . . . . . . . . . . . . . . . . 41

3.2 Concept Representation . . . . . . . . . . . . . . . . . . . . . . 443.2.1 Convex Regions of Concepts . . . . . . . . . . . . . . . 463.2.2 Context-dependent Weights of Concepts . . . . . . . . . 49

3.3 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49

v

Contents

1Introduction11.1Motivation.............................21.2ProblemStatement.........................51.3ResearchQuestion.........................61.4Contributions............................71.5ThesisOutline............................91.6Publications.............................12

ICreatingSemanticRepresentationsforNumericalData17

2BackgroundandRelatedWork192.1SemanticRepresentation......................192.2OntheTheoryofConceptualSpaces...............20

2.2.1IdentifyingQualityDimensions..............242.2.2RelatedWorkonConceptualSpacesandAI.......25

2.3GeneratingLinguisticDescriptions................272.3.1LinguisticDescriptionsofData(LDD)..........272.3.2NaturalLanguageGeneration(NLG)...........29

2.4Conclusions.............................33

3Data-DrivenConstructionofConceptualSpaces353.1DomainandQualityDimensionSpecification..........37

3.1.1FeatureSubsetRanking..................393.1.2FeatureSubsetGrouping..................41

3.2ConceptRepresentation......................443.2.1ConvexRegionsofConcepts...............463.2.2Context-dependentWeightsofConcepts.........49

3.3Discussion..............................49

v

vi CONTENTS

4 Semantic Inference in Conceptual Spaces 534.1 Symbol Space Definition . . . . . . . . . . . . . . . . . . . . . . 544.2 Inferring Linguistic Descriptions . . . . . . . . . . . . . . . . . . 56

4.2.1 Phase A: Inference in Conceptual Space . . . . . . . . . . 574.2.2 Phase B: Inference in Symbol Space . . . . . . . . . . . . 65

4.3 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70

5 Results and Evaluation: A Case Study on Leaf Data Set 735.1 Constructing a Conceptual Space of Leaves . . . . . . . . . . . . 74

5.1.1 Domain Specification for Leaf Data set . . . . . . . . . . 755.1.2 Concept Representation for Leaf Concepts . . . . . . . . 77

5.2 Semantic Inference for Unknown Leaf Samples . . . . . . . . . . 775.2.1 Inference in Conceptual Space of Leaves . . . . . . . . . 785.2.2 Inference in Symbol Space of Leaves . . . . . . . . . . . 78

5.3 Empirical Evaluation for Leaf Samples . . . . . . . . . . . . . . 785.3.1 Survey: Design and Procedure . . . . . . . . . . . . . . . 805.3.2 Identifying Leaf Observations from Linguistic Descriptions 825.3.3 Rating Linguistic Descriptions of Leaf Observations . . . 83

5.4 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85

II Physiological Sensor Data: From Data Analysis to Linguis-tic Descriptions 89

6 An Overview of Health Monitoring with Mining Physiological SensorData 916.1 Data Mining Tasks in Health Monitoring Systems . . . . . . . . 926.2 Data Mining in Health Monitoring Systems . . . . . . . . . . . 95

6.2.1 Preprocessing . . . . . . . . . . . . . . . . . . . . . . . . 966.2.2 Feature Extraction/Selection . . . . . . . . . . . . . . . . 966.2.3 Modelling and Learning Methods . . . . . . . . . . . . . 97

6.3 Data Sets: Acquisition and Properties . . . . . . . . . . . . . . . 996.3.1 Sensor Data Acquisition . . . . . . . . . . . . . . . . . . 996.3.2 Sensor Data Properties . . . . . . . . . . . . . . . . . . . 100

6.4 Discussion and Challenges . . . . . . . . . . . . . . . . . . . . . 101

7 Physiological Time Series Data: Preparation and Processing 1057.1 Input Time Series Sensor Data: Collection and Acquisition . . . 106

7.1.1 Wearable Sensors, Non-clinical Data . . . . . . . . . . . 1067.1.2 Clinical Physiological Data . . . . . . . . . . . . . . . . . 107

7.2 Trend Detection in Physiological Time Series . . . . . . . . . . . 1097.3 Pattern Abstraction in Physiological Data . . . . . . . . . . . . . 113

7.3.1 Background on Pattern Abstraction . . . . . . . . . . . . 1137.3.2 Prototypical Pattern Abstraction . . . . . . . . . . . . . . 115

viCONTENTS

4SemanticInferenceinConceptualSpaces534.1SymbolSpaceDefinition......................544.2InferringLinguisticDescriptions..................56

4.2.1PhaseA:InferenceinConceptualSpace..........574.2.2PhaseB:InferenceinSymbolSpace............65

4.3Discussion..............................70

5ResultsandEvaluation:ACaseStudyonLeafDataSet735.1ConstructingaConceptualSpaceofLeaves............74

5.1.1DomainSpecificationforLeafDataset..........755.1.2ConceptRepresentationforLeafConcepts........77

5.2SemanticInferenceforUnknownLeafSamples..........775.2.1InferenceinConceptualSpaceofLeaves.........785.2.2InferenceinSymbolSpaceofLeaves...........78

5.3EmpiricalEvaluationforLeafSamples..............785.3.1Survey:DesignandProcedure...............805.3.2IdentifyingLeafObservationsfromLinguisticDescriptions825.3.3RatingLinguisticDescriptionsofLeafObservations...83

5.4Discussion..............................85

IIPhysiologicalSensorData:FromDataAnalysistoLinguis-ticDescriptions89

6AnOverviewofHealthMonitoringwithMiningPhysiologicalSensorData916.1DataMiningTasksinHealthMonitoringSystems........926.2DataMininginHealthMonitoringSystems...........95

6.2.1Preprocessing........................966.2.2FeatureExtraction/Selection................966.2.3ModellingandLearningMethods.............97

6.3DataSets:AcquisitionandProperties...............996.3.1SensorDataAcquisition..................996.3.2SensorDataProperties...................100

6.4DiscussionandChallenges.....................101

7PhysiologicalTimeSeriesData:PreparationandProcessing1057.1InputTimeSeriesSensorData:CollectionandAcquisition...106

7.1.1WearableSensors,Non-clinicalData...........1067.1.2ClinicalPhysiologicalData.................107

7.2TrendDetectioninPhysiologicalTimeSeries...........1097.3PatternAbstractioninPhysiologicalData.............113

7.3.1BackgroundonPatternAbstraction............1137.3.2PrototypicalPatternAbstraction..............115

vi CONTENTS

4 Semantic Inference in Conceptual Spaces 534.1 Symbol Space Definition . . . . . . . . . . . . . . . . . . . . . . 544.2 Inferring Linguistic Descriptions . . . . . . . . . . . . . . . . . . 56

4.2.1 Phase A: Inference in Conceptual Space . . . . . . . . . . 574.2.2 Phase B: Inference in Symbol Space . . . . . . . . . . . . 65

4.3 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70

5 Results and Evaluation: A Case Study on Leaf Data Set 735.1 Constructing a Conceptual Space of Leaves . . . . . . . . . . . . 74

5.1.1 Domain Specification for Leaf Data set . . . . . . . . . . 755.1.2 Concept Representation for Leaf Concepts . . . . . . . . 77

5.2 Semantic Inference for Unknown Leaf Samples . . . . . . . . . . 775.2.1 Inference in Conceptual Space of Leaves . . . . . . . . . 785.2.2 Inference in Symbol Space of Leaves . . . . . . . . . . . 78

5.3 Empirical Evaluation for Leaf Samples . . . . . . . . . . . . . . 785.3.1 Survey: Design and Procedure . . . . . . . . . . . . . . . 805.3.2 Identifying Leaf Observations from Linguistic Descriptions 825.3.3 Rating Linguistic Descriptions of Leaf Observations . . . 83

5.4 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85

II Physiological Sensor Data: From Data Analysis to Linguis-tic Descriptions 89

6 An Overview of Health Monitoring with Mining Physiological SensorData 916.1 Data Mining Tasks in Health Monitoring Systems . . . . . . . . 926.2 Data Mining in Health Monitoring Systems . . . . . . . . . . . 95

6.2.1 Preprocessing . . . . . . . . . . . . . . . . . . . . . . . . 966.2.2 Feature Extraction/Selection . . . . . . . . . . . . . . . . 966.2.3 Modelling and Learning Methods . . . . . . . . . . . . . 97

6.3 Data Sets: Acquisition and Properties . . . . . . . . . . . . . . . 996.3.1 Sensor Data Acquisition . . . . . . . . . . . . . . . . . . 996.3.2 Sensor Data Properties . . . . . . . . . . . . . . . . . . . 100

6.4 Discussion and Challenges . . . . . . . . . . . . . . . . . . . . . 101

7 Physiological Time Series Data: Preparation and Processing 1057.1 Input Time Series Sensor Data: Collection and Acquisition . . . 106

7.1.1 Wearable Sensors, Non-clinical Data . . . . . . . . . . . 1067.1.2 Clinical Physiological Data . . . . . . . . . . . . . . . . . 107

7.2 Trend Detection in Physiological Time Series . . . . . . . . . . . 1097.3 Pattern Abstraction in Physiological Data . . . . . . . . . . . . . 113

7.3.1 Background on Pattern Abstraction . . . . . . . . . . . . 1137.3.2 Prototypical Pattern Abstraction . . . . . . . . . . . . . . 115

viCONTENTS

4SemanticInferenceinConceptualSpaces534.1SymbolSpaceDefinition......................544.2InferringLinguisticDescriptions..................56

4.2.1PhaseA:InferenceinConceptualSpace..........574.2.2PhaseB:InferenceinSymbolSpace............65

4.3Discussion..............................70

5ResultsandEvaluation:ACaseStudyonLeafDataSet735.1ConstructingaConceptualSpaceofLeaves............74

5.1.1DomainSpecificationforLeafDataset..........755.1.2ConceptRepresentationforLeafConcepts........77

5.2SemanticInferenceforUnknownLeafSamples..........775.2.1InferenceinConceptualSpaceofLeaves.........785.2.2InferenceinSymbolSpaceofLeaves...........78

5.3EmpiricalEvaluationforLeafSamples..............785.3.1Survey:DesignandProcedure...............805.3.2IdentifyingLeafObservationsfromLinguisticDescriptions825.3.3RatingLinguisticDescriptionsofLeafObservations...83

5.4Discussion..............................85

IIPhysiologicalSensorData:FromDataAnalysistoLinguis-ticDescriptions89

6AnOverviewofHealthMonitoringwithMiningPhysiologicalSensorData916.1DataMiningTasksinHealthMonitoringSystems........926.2DataMininginHealthMonitoringSystems...........95

6.2.1Preprocessing........................966.2.2FeatureExtraction/Selection................966.2.3ModellingandLearningMethods.............97

6.3DataSets:AcquisitionandProperties...............996.3.1SensorDataAcquisition..................996.3.2SensorDataProperties...................100

6.4DiscussionandChallenges.....................101

7PhysiologicalTimeSeriesData:PreparationandProcessing1057.1InputTimeSeriesSensorData:CollectionandAcquisition...106

7.1.1WearableSensors,Non-clinicalData...........1067.1.2ClinicalPhysiologicalData.................107

7.2TrendDetectioninPhysiologicalTimeSeries...........1097.3PatternAbstractioninPhysiologicalData.............113

7.3.1BackgroundonPatternAbstraction............1137.3.2PrototypicalPatternAbstraction..............115

CONTENTS vii

7.4 Discussion and Summary . . . . . . . . . . . . . . . . . . . . . . 117

8 Mining and Describing Physiological Time Series Data 1198.1 Mining Temporal Rules in Physiological Sensor Data . . . . . . 120

8.1.1 Background on Temporal Rule Mining . . . . . . . . . . 1208.1.2 A New Approach for Temporal Rule Mining . . . . . . . 1228.1.3 Temporal Rule Set Similarity . . . . . . . . . . . . . . . . 1248.1.4 Results: Distinctive Rules in Clinical Settings . . . . . . . 1268.1.5 Evaluation of Rule Set Similarity in Clinical Conditions . 128

8.2 Linguistic Descriptions for Patterns and Temporal Rules . . . . . 1318.2.1 Trend and Pattern Description . . . . . . . . . . . . . . . 1328.2.2 Temporal Rule Representation . . . . . . . . . . . . . . . 133


9 Linguistic Descriptions for Patterns using Conceptual Spaces 1399.1 Constructing Conceptual Space of Patterns . . . . . . . . . . . . 140

9.1.1 Domain Specification for Time Series Pattern Data Set . . 1419.1.2 Concept Representation for Pattern Concepts . . . . . . 143

9.2 Semantic Inference for Unknown Patterns . . . . . . . . . . . . 1449.2.1 Inference in Conceptual Space of Patterns . . . . . . . . 1449.2.2 Inference in Symbol Space of Patterns . . . . . . . . . . . 145

9.3 Evaluation: Descriptions from Conceptual Spaces . . . . . . . . 1479.3.1 Survey: Design and Procedure for Pattern Data Set . . . . 1479.3.2 Identifying Pattern Observations from Linguistic Descrip-

tions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1479.3.3 Rating Linguistic Descriptions of Pattern Observations . 149


10 Conclusions 15310.1 Summary of Contributions . . . . . . . . . . . . . . . . . . . . . 153

10.1.1 Construction of Conceptual Spaces (C1) . . . . . . . . . 15310.1.2 Semantic Inference in Conceptual Spaces (C2) . . . . . . 15410.1.3 Mining Physiological Sensor Data (C3) . . . . . . . . . . 15510.1.4 Linguistic Description by Semantic Representations (C4) 155

10.2 Limitations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15610.3 Societal and Ethical Impacts . . . . . . . . . . . . . . . . . . . . 15810.4 Future Research Directions . . . . . . . . . . . . . . . . . . . . . 15910.5 Final Words . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163

References 165

CONTENTSvii

7.4DiscussionandSummary......................117

8MiningandDescribingPhysiologicalTimeSeriesData1198.1MiningTemporalRulesinPhysiologicalSensorData......120

8.1.1BackgroundonTemporalRuleMining..........1208.1.2ANewApproachforTemporalRuleMining.......1228.1.3TemporalRuleSetSimilarity................1248.1.4Results:DistinctiveRulesinClinicalSettings.......1268.1.5EvaluationofRuleSetSimilarityinClinicalConditions.128

8.2LinguisticDescriptionsforPatternsandTemporalRules.....1318.2.1TrendandPatternDescription...............1328.2.2TemporalRuleRepresentation...............133


9LinguisticDescriptionsforPatternsusingConceptualSpaces1399.1ConstructingConceptualSpaceofPatterns............140

9.1.1DomainSpecificationforTimeSeriesPatternDataSet..1419.1.2ConceptRepresentationforPatternConcepts......143

9.2SemanticInferenceforUnknownPatterns............1449.2.1InferenceinConceptualSpaceofPatterns........1449.2.2InferenceinSymbolSpaceofPatterns...........145

9.3Evaluation:DescriptionsfromConceptualSpaces........1479.3.1Survey:DesignandProcedureforPatternDataSet....1479.3.2IdentifyingPatternObservationsfromLinguisticDescrip-

tions.............................1479.3.3RatingLinguisticDescriptionsofPatternObservations.149


10Conclusions15310.1SummaryofContributions.....................153

10.1.1ConstructionofConceptualSpaces(C1).........15310.1.2SemanticInferenceinConceptualSpaces(C2)......15410.1.3MiningPhysiologicalSensorData(C3)..........15510.1.4LinguisticDescriptionbySemanticRepresentations(C4)155

10.2Limitations.............................15610.3SocietalandEthicalImpacts....................15810.4FutureResearchDirections.....................15910.5FinalWords.............................163

References165

CONTENTS vii


8 Mining and Describing Physiological Time Series Data 1198.1 Mining Temporal Rules in Physiological Sensor Data . . . . . . 120

8.1.1 Background on Temporal Rule Mining . . . . . . . . . . 1208.1.2 A New Approach for Temporal Rule Mining . . . . . . . 1228.1.3 Temporal Rule Set Similarity . . . . . . . . . . . . . . . . 1248.1.4 Results: Distinctive Rules in Clinical Settings . . . . . . . 1268.1.5 Evaluation of Rule Set Similarity in Clinical Conditions . 128

8.2 Linguistic Descriptions for Patterns and Temporal Rules . . . . . 1318.2.1 Trend and Pattern Description . . . . . . . . . . . . . . . 1328.2.2 Temporal Rule Representation . . . . . . . . . . . . . . . 133


9 Linguistic Descriptions for Patterns using Conceptual Spaces 1399.1 Constructing Conceptual Space of Patterns . . . . . . . . . . . . 140

9.1.1 Domain Specification for Time Series Pattern Data Set . . 1419.1.2 Concept Representation for Pattern Concepts . . . . . . 143

9.2 Semantic Inference for Unknown Patterns . . . . . . . . . . . . 1449.2.1 Inference in Conceptual Space of Patterns . . . . . . . . 1449.2.2 Inference in Symbol Space of Patterns . . . . . . . . . . . 145

9.3 Evaluation: Descriptions from Conceptual Spaces . . . . . . . . 1479.3.1 Survey: Design and Procedure for Pattern Data Set . . . . 1479.3.2 Identifying Pattern Observations from Linguistic Descrip-

tions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1479.3.3 Rating Linguistic Descriptions of Pattern Observations . 149


10 Conclusions 15310.1 Summary of Contributions . . . . . . . . . . . . . . . . . . . . . 153

10.1.1 Construction of Conceptual Spaces (C1) . . . . . . . . . 15310.1.2 Semantic Inference in Conceptual Spaces (C2) . . . . . . 15410.1.3 Mining Physiological Sensor Data (C3) . . . . . . . . . . 15510.1.4 Linguistic Description by Semantic Representations (C4) 155

10.2 Limitations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15610.3 Societal and Ethical Impacts . . . . . . . . . . . . . . . . . . . . 15810.4 Future Research Directions . . . . . . . . . . . . . . . . . . . . . 15910.5 Final Words . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163

References 165

CONTENTSvii


8MiningandDescribingPhysiologicalTimeSeriesData1198.1MiningTemporalRulesinPhysiologicalSensorData......120

8.1.1BackgroundonTemporalRuleMining..........1208.1.2ANewApproachforTemporalRuleMining.......1228.1.3TemporalRuleSetSimilarity................1248.1.4Results:DistinctiveRulesinClinicalSettings.......1268.1.5EvaluationofRuleSetSimilarityinClinicalConditions.128

8.2LinguisticDescriptionsforPatternsandTemporalRules.....1318.2.1TrendandPatternDescription...............1328.2.2TemporalRuleRepresentation...............133


9LinguisticDescriptionsforPatternsusingConceptualSpaces1399.1ConstructingConceptualSpaceofPatterns............140

9.1.1DomainSpecificationforTimeSeriesPatternDataSet..1419.1.2ConceptRepresentationforPatternConcepts......143

9.2SemanticInferenceforUnknownPatterns............1449.2.1InferenceinConceptualSpaceofPatterns........1449.2.2InferenceinSymbolSpaceofPatterns...........145

9.3Evaluation:DescriptionsfromConceptualSpaces........1479.3.1Survey:DesignandProcedureforPatternDataSet....1479.3.2IdentifyingPatternObservationsfromLinguisticDescrip-

tions.............................1479.3.3RatingLinguisticDescriptionsofPatternObservations.149


10Conclusions15310.1SummaryofContributions.....................153

10.1.1ConstructionofConceptualSpaces(C1).........15310.1.2SemanticInferenceinConceptualSpaces(C2)......15410.1.3MiningPhysiologicalSensorData(C3)..........15510.1.4LinguisticDescriptionbySemanticRepresentations(C4)155

10.2Limitations.............................15610.3SocietalandEthicalImpacts....................15810.4FutureResearchDirections.....................15910.5FinalWords.............................163

References165

List of Figures

1.1 A part of the Rumi’s poem in Persian, together with an illustra-tion of the story, adapted from [7]. . . . . . . . . . . . . . . . . 1

1.2 A schematic overview of the tasks to be performed in both the-oretical (inner box) and application (outer box) focuses of thethesis. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6

1.3 Thesis MindMap, illustrating the appearance of the researchtasks in the chapters, together with the contributions of this thesis. 14

2.1 A schematic presentation of a conceptual space of fruits. . . . . 242.2 The architecture of data-to-text systems, proposed by Reiter. . . 31

3.1 Illustration of the main steps for constructing a conceptual spacefrom a set of numeric data. . . . . . . . . . . . . . . . . . . . . . 37

3.2 Two phases of the domain and quality dimension specification,with input and output parameters of each phase. . . . . . . . . . 39

3.3 A weighted bipartite graph with two sets of vertices from thelabels Y and the selected features F ′. . . . . . . . . . . . . . . . 43

3.4 A bigraph graph and one selected biclique (blue edges) for theleaf example (explained in Example 3.5). . . . . . . . . . . . . . 45

3.5 A concept representation example in a conceptual space withdomains δa and δb. . . . . . . . . . . . . . . . . . . . . . . . . . 48

4.1 Illustration of the steps of inferring linguistic descriptions for anunknown observation via the constructed conceptual spaces andits corresponding symbol space. . . . . . . . . . . . . . . . . . . 54

4.2 Schematic of a conceptual space and the coupled symbol space. . 554.3 Two phases of the semantic inference for generating linguistic

descriptions, with the input and output parameters of each phase. 574.4 An illustration of four different cases with respect to the various

positions of an instance points within domains. . . . . . . . . . 60

ix

ListofFigures

1.1ApartoftheRumi’spoeminPersian,togetherwithanillustra-tionofthestory,adaptedfrom[7].................1

1.2Aschematicoverviewofthetaskstobeperformedinboththe-oretical(innerbox)andapplication(outerbox)focusesofthethesis.................................6

1.3ThesisMindMap,illustratingtheappearanceoftheresearchtasksinthechapters,togetherwiththecontributionsofthisthesis.14

2.1Aschematicpresentationofaconceptualspaceoffruits.....242.2Thearchitectureofdata-to-textsystems,proposedbyReiter...31

3.1Illustrationofthemainstepsforconstructingaconceptualspacefromasetofnumericdata......................37

3.2Twophasesofthedomainandqualitydimensionspecification,withinputandoutputparametersofeachphase..........39

3.3AweightedbipartitegraphwithtwosetsofverticesfromthelabelsYandtheselectedfeaturesF′................43

3.4Abigraphgraphandoneselectedbiclique(blueedges)fortheleafexample(explainedinExample3.5)..............45

3.5Aconceptrepresentationexampleinaconceptualspacewithdomainsδaandδb..........................48

4.1Illustrationofthestepsofinferringlinguisticdescriptionsforanunknownobservationviatheconstructedconceptualspacesanditscorrespondingsymbolspace...................54

4.2Schematicofaconceptualspaceandthecoupledsymbolspace..554.3Twophasesofthesemanticinferenceforgeneratinglinguistic

descriptions,withtheinputandoutputparametersofeachphase.574.4Anillustrationoffourdifferentcaseswithrespecttothevarious

positionsofaninstancepointswithindomains..........60

ix

List of Figures

1.1 A part of the Rumi’s poem in Persian, together with an illustra-tion of the story, adapted from [7]. . . . . . . . . . . . . . . . . 1

1.2 A schematic overview of the tasks to be performed in both the-oretical (inner box) and application (outer box) focuses of thethesis. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6

1.3 Thesis MindMap, illustrating the appearance of the researchtasks in the chapters, together with the contributions of this thesis. 14

2.1 A schematic presentation of a conceptual space of fruits. . . . . 242.2 The architecture of data-to-text systems, proposed by Reiter. . . 31

3.1 Illustration of the main steps for constructing a conceptual spacefrom a set of numeric data. . . . . . . . . . . . . . . . . . . . . . 37

3.2 Two phases of the domain and quality dimension specification,with input and output parameters of each phase. . . . . . . . . . 39

3.3 A weighted bipartite graph with two sets of vertices from thelabels Y and the selected features F ′. . . . . . . . . . . . . . . . 43

3.4 A bigraph graph and one selected biclique (blue edges) for theleaf example (explained in Example 3.5). . . . . . . . . . . . . . 45

3.5 A concept representation example in a conceptual space withdomains δa and δb. . . . . . . . . . . . . . . . . . . . . . . . . . 48

4.1 Illustration of the steps of inferring linguistic descriptions for anunknown observation via the constructed conceptual spaces andits corresponding symbol space. . . . . . . . . . . . . . . . . . . 54

4.2 Schematic of a conceptual space and the coupled symbol space. . 554.3 Two phases of the semantic inference for generating linguistic

descriptions, with the input and output parameters of each phase. 574.4 An illustration of four different cases with respect to the various

positions of an instance points within domains. . . . . . . . . . 60

ix

ListofFigures

1.1ApartoftheRumi’spoeminPersian,togetherwithanillustra-tionofthestory,adaptedfrom[7].................1

1.2Aschematicoverviewofthetaskstobeperformedinboththe-oretical(innerbox)andapplication(outerbox)focusesofthethesis.................................6

1.3ThesisMindMap,illustratingtheappearanceoftheresearchtasksinthechapters,togetherwiththecontributionsofthisthesis.14

2.1Aschematicpresentationofaconceptualspaceoffruits.....242.2Thearchitectureofdata-to-textsystems,proposedbyReiter...31

3.1Illustrationofthemainstepsforconstructingaconceptualspacefromasetofnumericdata......................37

3.2Twophasesofthedomainandqualitydimensionspecification,withinputandoutputparametersofeachphase..........39

3.3AweightedbipartitegraphwithtwosetsofverticesfromthelabelsYandtheselectedfeaturesF′................43

3.4Abigraphgraphandoneselectedbiclique(blueedges)fortheleafexample(explainedinExample3.5)..............45

3.5Aconceptrepresentationexampleinaconceptualspacewithdomainsδaandδb..........................48

4.1Illustrationofthestepsofinferringlinguisticdescriptionsforanunknownobservationviatheconstructedconceptualspacesanditscorrespondingsymbolspace...................54

4.2Schematicofaconceptualspaceandthecoupledsymbolspace..554.3Twophasesofthesemanticinferenceforgeneratinglinguistic

descriptions,withtheinputandoutputparametersofeachphase.574.4Anillustrationoffourdifferentcaseswithrespecttothevarious

positionsofaninstancepointswithindomains..........60

ix

x LIST OF FIGURES

4.5 Example of the membership functions of the linguistic terms cir-cular, elliptical, and elongated, describing the elongation qualitydimension. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65

4.6 Annotation and characterisation messages in AVM format . . . 69

5.1 Six species as the known leaves in the leaf data set. . . . . . . . 745.2 The bipartite graph presenting the relevance of features and la-

bels in leaf data set. . . . . . . . . . . . . . . . . . . . . . . . . . 755.3 The conceptual space of leaf data set. . . . . . . . . . . . . . . . 765.4 A set of unknown leaf samples. . . . . . . . . . . . . . . . . . . 795.5 Screenshots from two types of questions designed for the survey. 815.6 The description of leaf (a) has been shown 31 times to the par-

ticipants. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 825.7 The box plot of the rating scores received for each of the models,

deriving descriptions in leaf data set. . . . . . . . . . . . . . . . 84

6.1 A schematic overview of the position of the main data miningtasks concerning the different aspects of the health monitoringsystems. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94

6.2 A generic architecture of the primary data mining approach forwearable sensor data. . . . . . . . . . . . . . . . . . . . . . . . . 96

7.1 The wearable sensor, Bioharness3 [5], worn on the chest is ableto locally store the measured data or wirelessly transmit it viaBluetooth. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106

7.2 An example of non-clinical measurements depicting thirteen hoursof heart rate (HR, top) and respiration rate (RR, down) duringsequential activities. . . . . . . . . . . . . . . . . . . . . . . . . 107

7.3 An example of clinical sensor data from MIMIC data set withvariables heart rate (HR), blood pressure (BP), and respirationrate (RR). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109

7.4 An example of physiological time series data preprocessing andsegmentation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111

7.5 The output of partial trend detection algorithm for two seg-mented time series (HR on top and RR on bottom). . . . . . . . 114

7.6 An example of trend detection outputs for two different resolu-tions of heart rate data. . . . . . . . . . . . . . . . . . . . . . . . 115

7.7 An example of prototypical patterns for HR data in CHF condition. 117

8.1 Three temporal relations between one pattern P1 in P(tHR) andat most two patterns P2 in P(tBP). . . . . . . . . . . . . . . . . . 123

8.2 Result of cross-validation approach on a selected iteration forMI condition. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127

xLISTOFFIGURES

4.5Exampleofthemembershipfunctionsofthelinguistictermscir-cular,elliptical,andelongated,describingtheelongationqualitydimension...............................65

4.6AnnotationandcharacterisationmessagesinAVMformat...69

5.1Sixspeciesastheknownleavesintheleafdataset........745.2Thebipartitegraphpresentingtherelevanceoffeaturesandla-

belsinleafdataset..........................755.3Theconceptualspaceofleafdataset................765.4Asetofunknownleafsamples...................795.5Screenshotsfromtwotypesofquestionsdesignedforthesurvey.815.6Thedescriptionofleaf(a)hasbeenshown31timestothepar-

ticipants................................825.7Theboxplotoftheratingscoresreceivedforeachofthemodels,

derivingdescriptionsinleafdataset................84

6.1Aschematicoverviewofthepositionofthemaindataminingtasksconcerningthedifferentaspectsofthehealthmonitoringsystems................................94

6.2Agenericarchitectureoftheprimarydataminingapproachforwearablesensordata.........................96

7.1Thewearablesensor,Bioharness3[5],wornonthechestisabletolocallystorethemeasureddataorwirelesslytransmititviaBluetooth...............................106

7.2Anexampleofnon-clinicalmeasurementsdepictingthirteenhoursofheartrate(HR,top)andrespirationrate(RR,down)duringsequentialactivities.........................107

7.3AnexampleofclinicalsensordatafromMIMICdatasetwithvariablesheartrate(HR),bloodpressure(BP),andrespirationrate(RR)...............................109

7.4Anexampleofphysiologicaltimeseriesdatapreprocessingandsegmentation.............................111

7.5Theoutputofpartialtrenddetectionalgorithmfortwoseg-mentedtimeseries(HRontopandRRonbottom)........114

7.6Anexampleoftrenddetectionoutputsfortwodifferentresolu-tionsofheartratedata........................115

7.7AnexampleofprototypicalpatternsforHRdatainCHFcondition.117

8.1ThreetemporalrelationsbetweenonepatternP1inP(tHR)andatmosttwopatternsP2inP(tBP)..................123

8.2Resultofcross-validationapproachonaselectediterationforMIcondition.............................127

x LIST OF FIGURES

4.5 Example of the membership functions of the linguistic terms cir-cular, elliptical, and elongated, describing the elongation qualitydimension. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65

4.6 Annotation and characterisation messages in AVM format . . . 69

5.1 Six species as the known leaves in the leaf data set. . . . . . . . 745.2 The bipartite graph presenting the relevance of features and la-

bels in leaf data set. . . . . . . . . . . . . . . . . . . . . . . . . . 755.3 The conceptual space of leaf data set. . . . . . . . . . . . . . . . 765.4 A set of unknown leaf samples. . . . . . . . . . . . . . . . . . . 795.5 Screenshots from two types of questions designed for the survey. 815.6 The description of leaf (a) has been shown 31 times to the par-

ticipants. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 825.7 The box plot of the rating scores received for each of the models,

deriving descriptions in leaf data set. . . . . . . . . . . . . . . . 84

6.1 A schematic overview of the position of the main data miningtasks concerning the different aspects of the health monitoringsystems. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94

6.2 A generic architecture of the primary data mining approach forwearable sensor data. . . . . . . . . . . . . . . . . . . . . . . . . 96

7.1 The wearable sensor, Bioharness3 [5], worn on the chest is ableto locally store the measured data or wirelessly transmit it viaBluetooth. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106

7.2 An example of non-clinical measurements depicting thirteen hoursof heart rate (HR, top) and respiration rate (RR, down) duringsequential activities. . . . . . . . . . . . . . . . . . . . . . . . . 107

7.3 An example of clinical sensor data from MIMIC data set withvariables heart rate (HR), blood pressure (BP), and respirationrate (RR). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109

7.4 An example of physiological time series data preprocessing andsegmentation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111

7.5 The output of partial trend detection algorithm for two seg-mented time series (HR on top and RR on bottom). . . . . . . . 114

7.6 An example of trend detection outputs for two different resolu-tions of heart rate data. . . . . . . . . . . . . . . . . . . . . . . . 115

7.7 An example of prototypical patterns for HR data in CHF condition. 117

8.1 Three temporal relations between one pattern P1 in P(tHR) andat most two patterns P2 in P(tBP). . . . . . . . . . . . . . . . . . 123

8.2 Result of cross-validation approach on a selected iteration forMI condition. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127

xLISTOFFIGURES

4.5Exampleofthemembershipfunctionsofthelinguistictermscir-cular,elliptical,andelongated,describingtheelongationqualitydimension...............................65

4.6AnnotationandcharacterisationmessagesinAVMformat...69

5.1Sixspeciesastheknownleavesintheleafdataset........745.2Thebipartitegraphpresentingtherelevanceoffeaturesandla-

belsinleafdataset..........................755.3Theconceptualspaceofleafdataset................765.4Asetofunknownleafsamples...................795.5Screenshotsfromtwotypesofquestionsdesignedforthesurvey.815.6Thedescriptionofleaf(a)hasbeenshown31timestothepar-

ticipants................................825.7Theboxplotoftheratingscoresreceivedforeachofthemodels,

derivingdescriptionsinleafdataset................84

6.1Aschematicoverviewofthepositionofthemaindataminingtasksconcerningthedifferentaspectsofthehealthmonitoringsystems................................94

6.2Agenericarchitectureoftheprimarydataminingapproachforwearablesensordata.........................96

7.1Thewearablesensor,Bioharness3[5],wornonthechestisabletolocallystorethemeasureddataorwirelesslytransmititviaBluetooth...............................106

7.2Anexampleofnon-clinicalmeasurementsdepictingthirteenhoursofheartrate(HR,top)andrespirationrate(RR,down)duringsequentialactivities.........................107

7.3AnexampleofclinicalsensordatafromMIMICdatasetwithvariablesheartrate(HR),bloodpressure(BP),andrespirationrate(RR)...............................109

7.4Anexampleofphysiologicaltimeseriesdatapreprocessingandsegmentation.............................111

7.5Theoutputofpartialtrenddetectionalgorithmfortwoseg-mentedtimeseries(HRontopandRRonbottom)........114

7.6Anexampleoftrenddetectionoutputsfortwodifferentresolu-tionsofheartratedata........................115

7.7AnexampleofprototypicalpatternsforHRdatainCHFcondition.117

8.1ThreetemporalrelationsbetweenonepatternP1inP(tHR)andatmosttwopatternsP2inP(tBP)..................123

8.2Resultofcross-validationapproachonaselectediterationforMIcondition.............................127

LIST OF FIGURES xi

8.3 The number of rules for clinical conditions in TRM-ρ3 and TRM-ρ1 methods, in relation to the multivariate time series. . . . . . . 128

8.4 A selection of distinct temporal rules generated from physiolog-ical data in clinical conditions using TRM-ρ3 approach. . . . . . 129

8.5 Boxplot diagram of the occurrence ratios between one clinicalcondition’s rule set and the other conditions. . . . . . . . . . . . 131

8.6 The output of the partial trends for two segmented time series(HR on top and RR on bottom), shown in Figure 7.5. . . . . . 134

9.1 Four sets of time series patterns, presenting the known classes ofpatterns in the data set. . . . . . . . . . . . . . . . . . . . . . . . 141

9.2 The bipartite graph presenting the relevance of the features andthe labels in data set of time series patterns. . . . . . . . . . . . 142

9.3 The conceptual space of time series pattern data set: a graphicalpresentation of the determined domains with the correspondingquality dimensions and concepts. . . . . . . . . . . . . . . . . . 143

9.4 A set of unknown samples of time series patterns. . . . . . . . . 1449.5 Pie charts, showing the expertise level of the participants by mea-

suring how familiar they are with the introduced terminology forclass labels and features. . . . . . . . . . . . . . . . . . . . . . . 148

9.6 The box plot of the rating scores received for each of the modelsderiving descriptions in pattern data set. . . . . . . . . . . . . . 149

10.1 Human-specified semantic features for animal domain. . . . . . 16010.2 Yet another illustration for the story of The Elephant in the

Dark, adapted from [194]. . . . . . . . . . . . . . . . . . . . . . 163

LISTOFFIGURESxi

8.3ThenumberofrulesforclinicalconditionsinTRM-ρ3andTRM-ρ1methods,inrelationtothemultivariatetimeseries.......128

8.4Aselectionofdistincttemporalrulesgeneratedfromphysiolog-icaldatainclinicalconditionsusingTRM-ρ3approach......129

8.5Boxplotdiagramoftheoccurrenceratiosbetweenoneclinicalcondition’srulesetandtheotherconditions............131

8.6Theoutputofthepartialtrendsfortwosegmentedtimeseries(HRontopandRRonbottom),showninFigure7.5......134

9.1Foursetsoftimeseriespatterns,presentingtheknownclassesofpatternsinthedataset........................141

9.2Thebipartitegraphpresentingtherelevanceofthefeaturesandthelabelsindatasetoftimeseriespatterns............142

9.3Theconceptualspaceoftimeseriespatterndataset:agraphicalpresentationofthedetermineddomainswiththecorrespondingqualitydimensionsandconcepts..................143

9.4Asetofunknownsamplesoftimeseriespatterns.........1449.5Piecharts,showingtheexpertiseleveloftheparticipantsbymea-

suringhowfamiliartheyarewiththeintroducedterminologyforclasslabelsandfeatures.......................148

9.6Theboxplotoftheratingscoresreceivedforeachofthemodelsderivingdescriptionsinpatterndataset..............149

10.1Human-specifiedsemanticfeaturesforanimaldomain......16010.2YetanotherillustrationforthestoryofTheElephantinthe

Dark,adaptedfrom[194]......................163

LIST OF FIGURES xi

8.3 The number of rules for clinical conditions in TRM-ρ3 and TRM-ρ1 methods, in relation to the multivariate time series. . . . . . . 128

8.4 A selection of distinct temporal rules generated from physiolog-ical data in clinical conditions using TRM-ρ3 approach. . . . . . 129

8.5 Boxplot diagram of the occurrence ratios between one clinicalcondition’s rule set and the other conditions. . . . . . . . . . . . 131

8.6 The output of the partial trends for two segmented time series(HR on top and RR on bottom), shown in Figure 7.5. . . . . . 134

9.1 Four sets of time series patterns, presenting the known classes ofpatterns in the data set. . . . . . . . . . . . . . . . . . . . . . . . 141

9.2 The bipartite graph presenting the relevance of the features andthe labels in data set of time series patterns. . . . . . . . . . . . 142

9.3 The conceptual space of time series pattern data set: a graphicalpresentation of the determined domains with the correspondingquality dimensions and concepts. . . . . . . . . . . . . . . . . . 143

9.4 A set of unknown samples of time series patterns. . . . . . . . . 1449.5 Pie charts, showing the expertise level of the participants by mea-

suring how familiar they are with the introduced terminology forclass labels and features. . . . . . . . . . . . . . . . . . . . . . . 148

9.6 The box plot of the rating scores received for each of the modelsderiving descriptions in pattern data set. . . . . . . . . . . . . . 149

10.1 Human-specified semantic features for animal domain. . . . . . 16010.2 Yet another illustration for the story of The Elephant in the

Dark, adapted from [194]. . . . . . . . . . . . . . . . . . . . . . 163

LISTOFFIGURESxi

8.3ThenumberofrulesforclinicalconditionsinTRM-ρ3andTRM-ρ1methods,inrelationtothemultivariatetimeseries.......128

8.4Aselectionofdistincttemporalrulesgeneratedfromphysiolog-icaldatainclinicalconditionsusingTRM-ρ3approach......129

8.5Boxplotdiagramoftheoccurrenceratiosbetweenoneclinicalcondition’srulesetandtheotherconditions............131

8.6Theoutputofthepartialtrendsfortwosegmentedtimeseries(HRontopandRRonbottom),showninFigure7.5......134

9.1Foursetsoftimeseriespatterns,presentingtheknownclassesofpatternsinthedataset........................141

9.2Thebipartitegraphpresentingtherelevanceofthefeaturesandthelabelsindatasetoftimeseriespatterns............142

9.3Theconceptualspaceoftimeseriespatterndataset:agraphicalpresentationofthedetermineddomainswiththecorrespondingqualitydimensionsandconcepts..................143

9.4Asetofunknownsamplesoftimeseriespatterns.........1449.5Piecharts,showingtheexpertiseleveloftheparticipantsbymea-

suringhowfamiliartheyarewiththeintroducedterminologyforclasslabelsandfeatures.......................148

9.6Theboxplotoftheratingscoresreceivedforeachofthemodelsderivingdescriptionsinpatterndataset..............149

10.1Human-specifiedsemanticfeaturesforanimaldomain......16010.2YetanotherillustrationforthestoryofTheElephantinthe

Dark,adaptedfrom[194]......................163

List of Tables

2.1 Typical modules and tasks of an NLG system . . . . . . . . . . . 29

5.1 The linguistic descriptions derived for the unknown leaf samplesin Figure 5.4. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79

5.2 The overall scores calculated from rating responses to the differ-ent models in leaf data set. . . . . . . . . . . . . . . . . . . . . . 84

5.3 Summary of the one-way ANOVA and Wilcoxon tests for therating scores with respect to the models deriving descriptions. . 85

6.1 The summarisation of the most commonly used features of eachwearable sensor data in the literature. . . . . . . . . . . . . . . . 97

7.1 Clinical conditions and their subjects in mimic database, afterremoving unreliable measurements. . . . . . . . . . . . . . . . . 108

8.1 Occurrence ratios of rule sets for each pair of clinical conditionsin multivariate time series HR&RR, using TRM-ρ3. . . . . . . . 130

8.2 Subjects with the same condition in their nearest rule sets. . . . . 1328.3 The instances of linguistic terms used for describing trends. . . . 1338.4 A template-based textual representation of rules with temporal

relations. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1358.5 Textual representation of the acquired rules in Figure 8.4. . . . . 136

9.1 The linguistic descriptions derived for the unknown samples oftime series patterns in Figure 9.4. . . . . . . . . . . . . . . . . . 146

9.2 The overall scores calculated from the rating responses to thedifferent models in pattern data set. . . . . . . . . . . . . . . . . 149


xiii

ListofTables

2.1TypicalmodulesandtasksofanNLGsystem...........29

5.1ThelinguisticdescriptionsderivedfortheunknownleafsamplesinFigure5.4.............................79

5.2Theoverallscorescalculatedfromratingresponsestothediffer-entmodelsinleafdataset......................84

5.3Summaryoftheone-wayANOVAandWilcoxontestsfortheratingscoreswithrespecttothemodelsderivingdescriptions..85

6.1Thesummarisationofthemostcommonlyusedfeaturesofeachwearablesensordataintheliterature................97

7.1Clinicalconditionsandtheirsubjectsinmimicdatabase,afterremovingunreliablemeasurements.................108

8.1OccurrenceratiosofrulesetsforeachpairofclinicalconditionsinmultivariatetimeseriesHR&RR,usingTRM-ρ3........130

8.2Subjectswiththesameconditionintheirnearestrulesets.....1328.3Theinstancesoflinguistictermsusedfordescribingtrends....1338.4Atemplate-basedtextualrepresentationofruleswithtemporal

relations...............................1358.5TextualrepresentationoftheacquiredrulesinFigure8.4.....136

9.1ThelinguisticdescriptionsderivedfortheunknownsamplesoftimeseriespatternsinFigure9.4..................146

9.2Theoverallscorescalculatedfromtheratingresponsestothedifferentmodelsinpatterndataset.................149


xiii

List of Tables

2.1 Typical modules and tasks of an NLG system . . . . . . . . . . . 29

5.1 The linguistic descriptions derived for the unknown leaf samplesin Figure 5.4. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79

5.2 The overall scores calculated from rating responses to the differ-ent models in leaf data set. . . . . . . . . . . . . . . . . . . . . . 84


6.1 The summarisation of the most commonly used features of eachwearable sensor data in the literature. . . . . . . . . . . . . . . . 97

7.1 Clinical conditions and their subjects in mimic database, afterremoving unreliable measurements. . . . . . . . . . . . . . . . . 108

8.1 Occurrence ratios of rule sets for each pair of clinical conditionsin multivariate time series HR&RR, using TRM-ρ3. . . . . . . . 130

8.2 Subjects with the same condition in their nearest rule sets. . . . . 1328.3 The instances of linguistic terms used for describing trends. . . . 1338.4 A template-based textual representation of rules with temporal

relations. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1358.5 Textual representation of the acquired rules in Figure 8.4. . . . . 136

9.1 The linguistic descriptions derived for the unknown samples oftime series patterns in Figure 9.4. . . . . . . . . . . . . . . . . . 146

9.2 The overall scores calculated from the rating responses to thedifferent models in pattern data set. . . . . . . . . . . . . . . . . 149


xiii

ListofTables

2.1TypicalmodulesandtasksofanNLGsystem...........29

5.1ThelinguisticdescriptionsderivedfortheunknownleafsamplesinFigure5.4.............................79

5.2Theoverallscorescalculatedfromratingresponsestothediffer-entmodelsinleafdataset......................84


6.1Thesummarisationofthemostcommonlyusedfeaturesofeachwearablesensordataintheliterature................97

7.1Clinicalconditionsandtheirsubjectsinmimicdatabase,afterremovingunreliablemeasurements.................108

8.1OccurrenceratiosofrulesetsforeachpairofclinicalconditionsinmultivariatetimeseriesHR&RR,usingTRM-ρ3........130

8.2Subjectswiththesameconditionintheirnearestrulesets.....1328.3Theinstancesoflinguistictermsusedfordescribingtrends....1338.4Atemplate-basedtextualrepresentationofruleswithtemporal

relations...............................1358.5TextualrepresentationoftheacquiredrulesinFigure8.4.....136

9.1ThelinguisticdescriptionsderivedfortheunknownsamplesoftimeseriespatternsinFigure9.4..................146

9.2Theoverallscorescalculatedfromtheratingresponsestothedifferentmodelsinpatterndataset.................149


xiii

List of Algorithms

3.1 Feature Subset Ranking . . . . . . . . . . . . . . . . . . . . . . . 413.2 Feature Subset Grouping . . . . . . . . . . . . . . . . . . . . . . 443.3 Concept Representation . . . . . . . . . . . . . . . . . . . . . . . 46

4.1 Inference in Conceptual Space . . . . . . . . . . . . . . . . . . . . 614.2 Inference in Symbol Space . . . . . . . . . . . . . . . . . . . . . . 66

7.1 Partial Trend Detection . . . . . . . . . . . . . . . . . . . . . . . 113

8.1 RuleMatch(r,R, IR) . . . . . . . . . . . . . . . . . . . . . . . . . 1258.2 Occurrence(R1, R2, IR2 ) . . . . . . . . . . . . . . . . . . . . . . . 125

xv

ListofAlgorithms

3.1FeatureSubsetRanking.......................413.2FeatureSubsetGrouping......................443.3ConceptRepresentation.......................46

4.1InferenceinConceptualSpace....................614.2InferenceinSymbolSpace......................66

7.1PartialTrendDetection.......................113

8.1RuleMatch(r,R,IR).........................1258.2Occurrence(R1,R2,IR2).......................125

xv

List of Algorithms

3.1 Feature Subset Ranking . . . . . . . . . . . . . . . . . . . . . . . 413.2 Feature Subset Grouping . . . . . . . . . . . . . . . . . . . . . . 443.3 Concept Representation . . . . . . . . . . . . . . . . . . . . . . . 46

4.1 Inference in Conceptual Space . . . . . . . . . . . . . . . . . . . . 614.2 Inference in Symbol Space . . . . . . . . . . . . . . . . . . . . . . 66

7.1 Partial Trend Detection . . . . . . . . . . . . . . . . . . . . . . . 113

8.1 RuleMatch(r,R, IR) . . . . . . . . . . . . . . . . . . . . . . . . . 1258.2 Occurrence(R1, R2, IR2 ) . . . . . . . . . . . . . . . . . . . . . . . 125

xv

ListofAlgorithms

3.1FeatureSubsetRanking.......................413.2FeatureSubsetGrouping......................443.3ConceptRepresentation.......................46

4.1InferenceinConceptualSpace....................614.2InferenceinSymbolSpace......................66

7.1PartialTrendDetection.......................113

8.1RuleMatch(r,R,IR).........................1258.2Occurrence(R1,R2,IR2).......................125

xv

Chapter 1

Introduction

Rumi, the 13th century Persian poet and teacher of Sufism, has a story calledThe Elephant in the Dark1 in his extensive book of poetry, Masnavi. In

his retelling:

Some Hindus bring an elephant to be exhibited in a dark room. Anumber of men touch and feel the elephant in the dark and, de-pending upon where they touch it, they believe the elephant to belike a water spout (trunk), a fan (ear), a pillar (leg) and a throne(back)... [191].

Figure 1.1: A part of the Rumi’s poem in Persian, together with an illustrationof the story, adapted from [7].

1This parable originated on the Indian subcontinent and is known as “Blind men and an ele-phant” (See Figure 1.1).

1

Chapter1

Introduction

Rumi,the13thcenturyPersianpoetandteacherofSufism,hasastorycalledTheElephantintheDark1inhisextensivebookofpoetry,Masnavi.In

hisretelling:

SomeHindusbringanelephanttobeexhibitedinadarkroom.Anumberofmentouchandfeeltheelephantinthedarkand,de-pendinguponwheretheytouchit,theybelievetheelephanttobelikeawaterspout(trunk),afan(ear),apillar(leg)andathrone(back)...[191].

Figure1.1:ApartoftheRumi’spoeminPersian,togetherwithanillustrationofthestory,adaptedfrom[7].

1ThisparableoriginatedontheIndiansubcontinentandisknownas“Blindmenandanele-phant”(SeeFigure1.1).

1

Chapter 1

Introduction

Rumi, the 13th century Persian poet and teacher of Sufism, has a story calledThe Elephant in the Dark1 in his extensive book of poetry, Masnavi. In

his retelling:

Some Hindus bring an elephant to be exhibited in a dark room. Anumber of men touch and feel the elephant in the dark and, de-pending upon where they touch it, they believe the elephant to belike a water spout (trunk), a fan (ear), a pillar (leg) and a throne(back)... [191].

Figure 1.1: A part of the Rumi’s poem in Persian, together with an illustrationof the story, adapted from [7].

1This parable originated on the Indian subcontinent and is known as “Blind men and an ele-phant” (See Figure 1.1).

1

Chapter1

Introduction

Rumi,the13thcenturyPersianpoetandteacherofSufism,hasastorycalledTheElephantintheDark1inhisextensivebookofpoetry,Masnavi.In

hisretelling:

SomeHindusbringanelephanttobeexhibitedinadarkroom.Anumberofmentouchandfeeltheelephantinthedarkand,de-pendinguponwheretheytouchit,theybelievetheelephanttobelikeawaterspout(trunk),afan(ear),apillar(leg)andathrone(back)...[191].

Figure1.1:ApartoftheRumi’spoeminPersian,togetherwithanillustrationofthestory,adaptedfrom[7].

1ThisparableoriginatedontheIndiansubcontinentandisknownas“Blindmenandanele-phant”(SeeFigure1.1).

1

2 CHAPTER 1. INTRODUCTION

Rumi uses this parable to demonstrate the problem of perception limita-tions. The individuals have their own perceptions of the elephant (an unknownconcept for them) and therefore use their own inference to explain it. Thisproblem is closely related to the problem of describing a concept based onthe perceived information. The behaviour of the men in this story relied on ageneral cognitive process for learning concepts. They first perceived what theycould observe (or more generally, could sense in the case of touching the ele-phant) and they then sought to map or categorise the perceived informationaccording to similar concepts that were known to them. This process is knownas exemplar theory in cognitive science. However, their failure to successfullycharacterise or describe the concept of Elephant was due to the limitations oftheir sensory perceptions, which led to overreaching misinterpretations [1].

Describing unknown observations in natural language appears to be an easytask for humans. Both speakers and hearers have a great deal of common senseunderstanding of the concepts and properties that they have encountered inlife, and that enable them to describe such observations. For example, when J.K. Rowling presents the legendary creature of “Hippogriff” in her book HarryPotter and the Prisoner of Azkaban, she describes it as follows:

Hippogriffs have the bodies, hind legs, and tails of horses, but thefront legs, wings, and heads of giant eagles, with cruel, steel-colouredbeaks and large, brilliantly orange eyes. The talons on their frontlegs were half a foot long and deadly-looking... [190].

Rowling uses known or familiar concepts that are most similar (eagle andhorse), together with perceivable features (orange, long, large, etc.) to explainthe novel creature that she introduces2. In doing so, she uses linguistic termsthat are cognitively understandable for humans.

However, still, deriving descriptions for unknown concepts is no trivial taskin artificial intelligence (AI). The challenge is how an artificial intelligence sys-tem can perform the task of semantically describing unknown observations byrelying on a set of perceived information. This task becomes more crucial ifthe information given to the system is in the form of numeric or non-symbolicmeasurements (e.g., sensor data).

1.1 Motivation

Deriving semantics from real-world numerical observations such as sensor datahas become increasingly important for creating a common understanding ofinformation with humans. Artificial intelligence systems that can augment ob-servations from sensor data in order to create conceptual representations areneeded for applications that require interaction with humans in natural lan-guage. The motivation for the problem of the semantic description of numeri-

2The hippogriff was first mentioned by the Roman poet Virgil in his Eclogues, but the wordhippogriff is derived from the ancient Greek.

2CHAPTER1.INTRODUCTION

Rumiusesthisparabletodemonstratetheproblemofperceptionlimita-tions.Theindividualshavetheirownperceptionsoftheelephant(anunknownconceptforthem)andthereforeusetheirowninferencetoexplainit.Thisproblemiscloselyrelatedtotheproblemofdescribingaconceptbasedontheperceivedinformation.Thebehaviourofthemeninthisstoryreliedonageneralcognitiveprocessforlearningconcepts.Theyfirstperceivedwhattheycouldobserve(ormoregenerally,couldsenseinthecaseoftouchingtheele-phant)andtheythensoughttomaporcategorisetheperceivedinformationaccordingtosimilarconceptsthatwereknowntothem.Thisprocessisknownasexemplartheoryincognitivescience.However,theirfailuretosuccessfullycharacteriseordescribetheconceptofElephantwasduetothelimitationsoftheirsensoryperceptions,whichledtooverreachingmisinterpretations[1].

Describingunknownobservationsinnaturallanguageappearstobeaneasytaskforhumans.Bothspeakersandhearershaveagreatdealofcommonsenseunderstandingoftheconceptsandpropertiesthattheyhaveencounteredinlife,andthatenablethemtodescribesuchobservations.Forexample,whenJ.K.Rowlingpresentsthelegendarycreatureof“Hippogriff”inherbookHarryPotterandthePrisonerofAzkaban,shedescribesitasfollows:

Hippogriffshavethebodies,hindlegs,andtailsofhorses,butthefrontlegs,wings,andheadsofgianteagles,withcruel,steel-colouredbeaksandlarge,brilliantlyorangeeyes.Thetalonsontheirfrontlegswerehalfafootlonganddeadly-looking...[190].

Rowlingusesknownorfamiliarconceptsthataremostsimilar(eagleandhorse),togetherwithperceivablefeatures(orange,long,large,etc.)toexplainthenovelcreaturethatsheintroduces2.Indoingso,sheuseslinguistictermsthatarecognitivelyunderstandableforhumans.

However,still,derivingdescriptionsforunknownconceptsisnotrivialtaskinartificialintelligence(AI).Thechallengeishowanartificialintelligencesys-temcanperformthetaskofsemanticallydescribingunknownobservationsbyrelyingonasetofperceivedinformation.Thistaskbecomesmorecrucialiftheinformationgiventothesystemisintheformofnumericornon-symbolicmeasurements(e.g.,sensordata).

1.1Motivation

Derivingsemanticsfromreal-worldnumericalobservationssuchassensordatahasbecomeincreasinglyimportantforcreatingacommonunderstandingofinformationwithhumans.Artificialintelligencesystemsthatcanaugmentob-servationsfromsensordatainordertocreateconceptualrepresentationsareneededforapplicationsthatrequireinteractionwithhumansinnaturallan-guage.Themotivationfortheproblemofthesemanticdescriptionofnumeri-

2ThehippogriffwasfirstmentionedbytheRomanpoetVirgilinhisEclogues,butthewordhippogriffisderivedfromtheancientGreek.


Rumi uses this parable to demonstrate the problem of perception limita-tions. The individuals have their own perceptions of the elephant (an unknownconcept for them) and therefore use their own inference to explain it. Thisproblem is closely related to the problem of describing a concept based onthe perceived information. The behaviour of the men in this story relied on ageneral cognitive process for learning concepts. They first perceived what theycould observe (or more generally, could sense in the case of touching the ele-phant) and they then sought to map or categorise the perceived informationaccording to similar concepts that were known to them. This process is knownas exemplar theory in cognitive science. However, their failure to successfullycharacterise or describe the concept of Elephant was due to the limitations oftheir sensory perceptions, which led to overreaching misinterpretations [1].

Describing unknown observations in natural language appears to be an easytask for humans. Both speakers and hearers have a great deal of common senseunderstanding of the concepts and properties that they have encountered inlife, and that enable them to describe such observations. For example, when J.K. Rowling presents the legendary creature of “Hippogriff” in her book HarryPotter and the Prisoner of Azkaban, she describes it as follows:

Hippogriffs have the bodies, hind legs, and tails of horses, but thefront legs, wings, and heads of giant eagles, with cruel, steel-colouredbeaks and large, brilliantly orange eyes. The talons on their frontlegs were half a foot long and deadly-looking... [190].

Rowling uses known or familiar concepts that are most similar (eagle andhorse), together with perceivable features (orange, long, large, etc.) to explainthe novel creature that she introduces2. In doing so, she uses linguistic termsthat are cognitively understandable for humans.

However, still, deriving descriptions for unknown concepts is no trivial taskin artificial intelligence (AI). The challenge is how an artificial intelligence sys-tem can perform the task of semantically describing unknown observations byrelying on a set of perceived information. This task becomes more crucial ifthe information given to the system is in the form of numeric or non-symbolicmeasurements (e.g., sensor data).

1.1 Motivation

Deriving semantics from real-world numerical observations such as sensor datahas become increasingly important for creating a common understanding ofinformation with humans. Artificial intelligence systems that can augment ob-servations from sensor data in order to create conceptual representations areneeded for applications that require interaction with humans in natural lan-guage. The motivation for the problem of the semantic description of numeri-

2The hippogriff was first mentioned by the Roman poet Virgil in his Eclogues, but the wordhippogriff is derived from the ancient Greek.


Rumiusesthisparabletodemonstratetheproblemofperceptionlimita-tions.Theindividualshavetheirownperceptionsoftheelephant(anunknownconceptforthem)andthereforeusetheirowninferencetoexplainit.Thisproblemiscloselyrelatedtotheproblemofdescribingaconceptbasedontheperceivedinformation.Thebehaviourofthemeninthisstoryreliedonageneralcognitiveprocessforlearningconcepts.Theyfirstperceivedwhattheycouldobserve(ormoregenerally,couldsenseinthecaseoftouchingtheele-phant)andtheythensoughttomaporcategorisetheperceivedinformationaccordingtosimilarconceptsthatwereknowntothem.Thisprocessisknownasexemplartheoryincognitivescience.However,theirfailuretosuccessfullycharacteriseordescribetheconceptofElephantwasduetothelimitationsoftheirsensoryperceptions,whichledtooverreachingmisinterpretations[1].

Describingunknownobservationsinnaturallanguageappearstobeaneasytaskforhumans.Bothspeakersandhearershaveagreatdealofcommonsenseunderstandingoftheconceptsandpropertiesthattheyhaveencounteredinlife,andthatenablethemtodescribesuchobservations.Forexample,whenJ.K.Rowlingpresentsthelegendarycreatureof“Hippogriff”inherbookHarryPotterandthePrisonerofAzkaban,shedescribesitasfollows:

Hippogriffshavethebodies,hindlegs,andtailsofhorses,butthefrontlegs,wings,andheadsofgianteagles,withcruel,steel-colouredbeaksandlarge,brilliantlyorangeeyes.Thetalonsontheirfrontlegswerehalfafootlonganddeadly-looking...[190].

Rowlingusesknownorfamiliarconceptsthataremostsimilar(eagleandhorse),togetherwithperceivablefeatures(orange,long,large,etc.)toexplainthenovelcreaturethatsheintroduces2.Indoingso,sheuseslinguistictermsthatarecognitivelyunderstandableforhumans.

However,still,derivingdescriptionsforunknownconceptsisnotrivialtaskinartificialintelligence(AI).Thechallengeishowanartificialintelligencesys-temcanperformthetaskofsemanticallydescribingunknownobservationsbyrelyingonasetofperceivedinformation.Thistaskbecomesmorecrucialiftheinformationgiventothesystemisintheformofnumericornon-symbolicmeasurements(e.g.,sensordata).

1.1Motivation

Derivingsemanticsfromreal-worldnumericalobservationssuchassensordatahasbecomeincreasinglyimportantforcreatingacommonunderstandingofinformationwithhumans.Artificialintelligencesystemsthatcanaugmentob-servationsfromsensordatainordertocreateconceptualrepresentationsareneededforapplicationsthatrequireinteractionwithhumansinnaturallan-guage.Themotivationfortheproblemofthesemanticdescriptionofnumeri-

2ThehippogriffwasfirstmentionedbytheRomanpoetVirgilinhisEclogues,butthewordhippogriffisderivedfromtheancientGreek.

1.1. MOTIVATION 3

cal data has its origin in the topic of knowledge representation in AI research.However, the initial motivation in this thesis comes from a real-world appli-cation for analysing sensor data in healthcare monitoring systems. The rest ofthis section explains both the theoretical and the application-based motivationsthat underlie the problem to be considered.

One goal of cognitive science is to construct artificial systems that can un-derstand and model the cognitive activities of humans, such as concept learningand semantic inference [14]. However, a key issue is how the given informationis to be modelled in knowledge representation frameworks [85, 87]. The twoparadigms of symbolic and sub-symbolic representations have been the twodominant (and sometimes competing) approaches to addressing the issue ofrepresentation in AI [85, 219]. Symbolic approaches use explicit symbols asprimitives when performing symbol manipulation in order to model high-levelabstract concepts [155]. Sub-symbolic (connectionist) approaches often focuson the categorisation tasks per se. They process the activation patterns of inputconcepts at the perceptual level, using internally connected units of artificialneural networks [219].

With regard to the task of the semantic description of concepts by means ofperceived data (henceforth, concept description), this thesis highlights two AIproblems. The first is the problem of induction or, more generally, the issue oflearning. Inductive inference performs a generalisation from a limited numberof observations, which infers the characteristics of the concepts. Induction ishighly related to the task of concept learning or concept formation in cognitivescience [84]. The second problem is related to semantics or, in general terms, theissue of explainability or interpretability in AI. Semantic inference is the processof inferring meaningful descriptions or truth conditions from a set of (semanti-cally enriched) information, which is usually represented in the form of logicalor natural sentences. Neither of the representational approaches (symbolic andsub-symbolic) satisfactorily addresses the AI problems noted above (conceptlearning and semantic inference) simultaneously. Concept learning is a diffi-cult task for symbolic approaches, since the symbolic AI has been formalisedon the basis of rule-based representations of the logical source of knowledge,rather than having intrinsically learnt rules from observations [219]. Semanticinference cannot be addressed by the sub-symbolic approaches since there isno interpretable transformation from the low-level information of the model tothe organised high-level symbols. In other words, the sub-symbolic approachesare unable to explain what the emerging learnt model represents [89].

Consequently, the theory of conceptual spaces was introduced by Gärden-fors [86] as a mid-level representation between the symbolic and the sub-symbolicapproaches to addressing both the concept learning and the semantic inferenceproblems [14,116]. The theory of conceptual spaces presents a framework thatconsists of a set of quality dimensions in various domains. These are placedwithin a geometrical structure in order to model, categorise, and represent the

1.1.MOTIVATION3

caldatahasitsorigininthetopicofknowledgerepresentationinAIresearch.However,theinitialmotivationinthisthesiscomesfromareal-worldappli-cationforanalysingsensordatainhealthcaremonitoringsystems.Therestofthissectionexplainsboththetheoreticalandtheapplication-basedmotivationsthatunderlietheproblemtobeconsidered.

Onegoalofcognitivescienceistoconstructartificialsystemsthatcanun-derstandandmodelthecognitiveactivitiesofhumans,suchasconceptlearningandsemanticinference[14].However,akeyissueishowthegiveninformationistobemodelledinknowledgerepresentationframeworks[85,87].Thetwoparadigmsofsymbolicandsub-symbolicrepresentationshavebeenthetwodominant(andsometimescompeting)approachestoaddressingtheissueofrepresentationinAI[85,219].Symbolicapproachesuseexplicitsymbolsasprimitiveswhenperformingsymbolmanipulationinordertomodelhigh-levelabstractconcepts[155].Sub-symbolic(connectionist)approachesoftenfocusonthecategorisationtasksperse.Theyprocesstheactivationpatternsofinputconceptsattheperceptuallevel,usinginternallyconnectedunitsofartificialneuralnetworks[219].

Withregardtothetaskofthesemanticdescriptionofconceptsbymeansofperceiveddata(henceforth,conceptdescription),thisthesishighlightstwoAIproblems.Thefirstistheproblemofinductionor,moregenerally,theissueoflearning.Inductiveinferenceperformsageneralisationfromalimitednumberofobservations,whichinfersthecharacteristicsoftheconcepts.Inductionishighlyrelatedtothetaskofconceptlearningorconceptformationincognitivescience[84].Thesecondproblemisrelatedtosemanticsor,ingeneralterms,theissueofexplainabilityorinterpretabilityinAI.Semanticinferenceistheprocessofinferringmeaningfuldescriptionsortruthconditionsfromasetof(semanti-callyenriched)information,whichisusuallyrepresentedintheformoflogicalornaturalsentences.Neitheroftherepresentationalapproaches(symbolicandsub-symbolic)satisfactorilyaddressestheAIproblemsnotedabove(conceptlearningandsemanticinference)simultaneously.Conceptlearningisadiffi-culttaskforsymbolicapproaches,sincethesymbolicAIhasbeenformalisedonthebasisofrule-basedrepresentationsofthelogicalsourceofknowledge,ratherthanhavingintrinsicallylearntrulesfromobservations[219].Semanticinferencecannotbeaddressedbythesub-symbolicapproachessincethereisnointerpretabletransformationfromthelow-levelinformationofthemodeltotheorganisedhigh-levelsymbols.Inotherwords,thesub-symbolicapproachesareunabletoexplainwhattheemerginglearntmodelrepresents[89].

Consequently,thetheoryofconceptualspaceswasintroducedbyGärden-fors[86]asamid-levelrepresentationbetweenthesymbolicandthesub-symbolicapproachestoaddressingboththeconceptlearningandthesemanticinferenceproblems[14,116].Thetheoryofconceptualspacespresentsaframeworkthatconsistsofasetofqualitydimensionsinvariousdomains.Theseareplacedwithinageometricalstructureinordertomodel,categorise,andrepresentthe

1.1. MOTIVATION 3

cal data has its origin in the topic of knowledge representation in AI research.However, the initial motivation in this thesis comes from a real-world appli-cation for analysing sensor data in healthcare monitoring systems. The rest ofthis section explains both the theoretical and the application-based motivationsthat underlie the problem to be considered.

One goal of cognitive science is to construct artificial systems that can un-derstand and model the cognitive activities of humans, such as concept learningand semantic inference [14]. However, a key issue is how the given informationis to be modelled in knowledge representation frameworks [85, 87]. The twoparadigms of symbolic and sub-symbolic representations have been the twodominant (and sometimes competing) approaches to addressing the issue ofrepresentation in AI [85, 219]. Symbolic approaches use explicit symbols asprimitives when performing symbol manipulation in order to model high-levelabstract concepts [155]. Sub-symbolic (connectionist) approaches often focuson the categorisation tasks per se. They process the activation patterns of inputconcepts at the perceptual level, using internally connected units of artificialneural networks [219].

With regard to the task of the semantic description of concepts by means ofperceived data (henceforth, concept description), this thesis highlights two AIproblems. The first is the problem of induction or, more generally, the issue oflearning. Inductive inference performs a generalisation from a limited numberof observations, which infers the characteristics of the concepts. Induction ishighly related to the task of concept learning or concept formation in cognitivescience [84]. The second problem is related to semantics or, in general terms, theissue of explainability or interpretability in AI. Semantic inference is the processof inferring meaningful descriptions or truth conditions from a set of (semanti-cally enriched) information, which is usually represented in the form of logicalor natural sentences. Neither of the representational approaches (symbolic andsub-symbolic) satisfactorily addresses the AI problems noted above (conceptlearning and semantic inference) simultaneously. Concept learning is a diffi-cult task for symbolic approaches, since the symbolic AI has been formalisedon the basis of rule-based representations of the logical source of knowledge,rather than having intrinsically learnt rules from observations [219]. Semanticinference cannot be addressed by the sub-symbolic approaches since there isno interpretable transformation from the low-level information of the model tothe organised high-level symbols. In other words, the sub-symbolic approachesare unable to explain what the emerging learnt model represents [89].

Consequently, the theory of conceptual spaces was introduced by Gärden-fors [86] as a mid-level representation between the symbolic and the sub-symbolicapproaches to addressing both the concept learning and the semantic inferenceproblems [14,116]. The theory of conceptual spaces presents a framework thatconsists of a set of quality dimensions in various domains. These are placedwithin a geometrical structure in order to model, categorise, and represent the

1.1.MOTIVATION3

caldatahasitsorigininthetopicofknowledgerepresentationinAIresearch.However,theinitialmotivationinthisthesiscomesfromareal-worldappli-cationforanalysingsensordatainhealthcaremonitoringsystems.Therestofthissectionexplainsboththetheoreticalandtheapplication-basedmotivationsthatunderlietheproblemtobeconsidered.

Onegoalofcognitivescienceistoconstructartificialsystemsthatcanun-derstandandmodelthecognitiveactivitiesofhumans,suchasconceptlearningandsemanticinference[14].However,akeyissueishowthegiveninformationistobemodelledinknowledgerepresentationframeworks[85,87].Thetwoparadigmsofsymbolicandsub-symbolicrepresentationshavebeenthetwodominant(andsometimescompeting)approachestoaddressingtheissueofrepresentationinAI[85,219].Symbolicapproachesuseexplicitsymbolsasprimitiveswhenperformingsymbolmanipulationinordertomodelhigh-levelabstractconcepts[155].Sub-symbolic(connectionist)approachesoftenfocusonthecategorisationtasksperse.Theyprocesstheactivationpatternsofinputconceptsattheperceptuallevel,usinginternallyconnectedunitsofartificialneuralnetworks[219].

Withregardtothetaskofthesemanticdescriptionofconceptsbymeansofperceiveddata(henceforth,conceptdescription),thisthesishighlightstwoAIproblems.Thefirstistheproblemofinductionor,moregenerally,theissueoflearning.Inductiveinferenceperformsageneralisationfromalimitednumberofobservations,whichinfersthecharacteristicsoftheconcepts.Inductionishighlyrelatedtothetaskofconceptlearningorconceptformationincognitivescience[84].Thesecondproblemisrelatedtosemanticsor,ingeneralterms,theissueofexplainabilityorinterpretabilityinAI.Semanticinferenceistheprocessofinferringmeaningfuldescriptionsortruthconditionsfromasetof(semanti-callyenriched)information,whichisusuallyrepresentedintheformoflogicalornaturalsentences.Neitheroftherepresentationalapproaches(symbolicandsub-symbolic)satisfactorilyaddressestheAIproblemsnotedabove(conceptlearningandsemanticinference)simultaneously.Conceptlearningisadiffi-culttaskforsymbolicapproaches,sincethesymbolicAIhasbeenformalisedonthebasisofrule-basedrepresentationsofthelogicalsourceofknowledge,ratherthanhavingintrinsicallylearntrulesfromobservations[219].Semanticinferencecannotbeaddressedbythesub-symbolicapproachessincethereisnointerpretabletransformationfromthelow-levelinformationofthemodeltotheorganisedhigh-levelsymbols.Inotherwords,thesub-symbolicapproachesareunabletoexplainwhattheemerginglearntmodelrepresents[89].

Consequently,thetheoryofconceptualspaceswasintroducedbyGärden-fors[86]asamid-levelrepresentationbetweenthesymbolicandthesub-symbolicapproachestoaddressingboththeconceptlearningandthesemanticinferenceproblems[14,116].Thetheoryofconceptualspacespresentsaframeworkthatconsistsofasetofqualitydimensionsinvariousdomains.Theseareplacedwithinageometricalstructureinordertomodel,categorise,andrepresentthe


concepts in a multi-dimensional space [86]. In the literature, conceptual spacesare principally derived in a knowledge-driven manner. These spaces operate onthe assumption that there is prior knowledge from perceptual mechanisms orexperts that manually initialises the elements of the conceptual space (i.e., do-mains, quality dimensions, and concepts’ regions) [10,189]. However, since thisthesis relies on observed input data, the motivational challenge arises of how toautomatically construct a conceptual space from the given information [131]in order to perform concept learning and semantic inference tasks. Performingthese task are the required steps to address the problem of concept description(especially to describe unknown observations). This is an important motiva-tion, due to a growing class of problems that involves more complicated inputobservations. These problems deal with raw sensor data that have little or noprior knowledge concerning their semantic significance [188]. Therefore, spec-ifying the interpretable elements of a representational model for such problemsis no trivial task.

New applications in several domains are increasingly required to rely on au-tomatic methods for forming concepts directly from sensor data. One area thatsensor data is crucially used in is the field of health monitoring, particularly inclinical conditions. Monitoring the vital signs of the subjects (whether healthyor unhealthy) is essential if the medical domain is to identify the different be-haviours of the health parameters as symptoms of various medical diseases [28].Such physiological parameters include heart rate, respiration rate, blood pres-sure, etc. Various sensors have been developed to measure these vital signs inboth clinical conditions and home healthcare systems, and these accumulatea massive amount of data. With regard to wearable sensors, the continuousparameters in the form of time series signals are of particular interest in thisresearch (in contrast to discrete ones), since it is more critical to consider thehigh resolution sequence of information (with very small time-stamps).

Critical challenges in the field of health monitoring include not only statis-tically mining massive amount of physiological time series data, but also dis-covering understandable interpretations for the extracted information [32]. Anew aspect of analysing sensor data will involve going beyond expert knowl-edge and recognising information (e.g., events, patterns, anomalies) that is notpre-defined by the system. This aspect will lead to the analysis phase being con-ducted in a data-driven way in order to reveal information that was not pre-viously seen, but is worth analysing and interpreting. A motivational exampleis the exploitation of interesting trends and patterns in sensor time series data.A knowledge-based system may look for the known pre-defined trends askedby the experts (e.g., peaks of heart rate signals). However, one can also anal-yse the data itself in order to detect interesting and meaningful patterns foundthroughout the data (e.g., repeated sudden drops of heart rate while sleeping).These are not necessarily requested by or even visible to the experts, but theyare worth being reported. Nevertheless, the primary challenge remains finding


conceptsinamulti-dimensionalspace[86].Intheliterature,conceptualspacesareprincipallyderivedinaknowledge-drivenmanner.Thesespacesoperateontheassumptionthatthereispriorknowledgefromperceptualmechanismsorexpertsthatmanuallyinitialisestheelementsoftheconceptualspace(i.e.,do-mains,qualitydimensions,andconcepts’regions)[10,189].However,sincethisthesisreliesonobservedinputdata,themotivationalchallengearisesofhowtoautomaticallyconstructaconceptualspacefromthegiveninformation[131]inordertoperformconceptlearningandsemanticinferencetasks.Performingthesetaskaretherequiredstepstoaddresstheproblemofconceptdescription(especiallytodescribeunknownobservations).Thisisanimportantmotiva-tion,duetoagrowingclassofproblemsthatinvolvesmorecomplicatedinputobservations.Theseproblemsdealwithrawsensordatathathavelittleornopriorknowledgeconcerningtheirsemanticsignificance[188].Therefore,spec-ifyingtheinterpretableelementsofarepresentationalmodelforsuchproblemsisnotrivialtask.

Newapplicationsinseveraldomainsareincreasinglyrequiredtorelyonau-tomaticmethodsforformingconceptsdirectlyfromsensordata.Oneareathatsensordataiscruciallyusedinisthefieldofhealthmonitoring,particularlyinclinicalconditions.Monitoringthevitalsignsofthesubjects(whetherhealthyorunhealthy)isessentialifthemedicaldomainistoidentifythedifferentbe-havioursofthehealthparametersassymptomsofvariousmedicaldiseases[28].Suchphysiologicalparametersincludeheartrate,respirationrate,bloodpres-sure,etc.Varioussensorshavebeendevelopedtomeasurethesevitalsignsinbothclinicalconditionsandhomehealthcaresystems,andtheseaccumulateamassiveamountofdata.Withregardtowearablesensors,thecontinuousparametersintheformoftimeseriessignalsareofparticularinterestinthisresearch(incontrasttodiscreteones),sinceitismorecriticaltoconsiderthehighresolutionsequenceofinformation(withverysmalltime-stamps).

Criticalchallengesinthefieldofhealthmonitoringincludenotonlystatis-ticallyminingmassiveamountofphysiologicaltimeseriesdata,butalsodis-coveringunderstandableinterpretationsfortheextractedinformation[32].Anewaspectofanalysingsensordatawillinvolvegoingbeyondexpertknowl-edgeandrecognisinginformation(e.g.,events,patterns,anomalies)thatisnotpre-definedbythesystem.Thisaspectwillleadtotheanalysisphasebeingcon-ductedinadata-drivenwayinordertorevealinformationthatwasnotpre-viouslyseen,butisworthanalysingandinterpreting.Amotivationalexampleistheexploitationofinterestingtrendsandpatternsinsensortimeseriesdata.Aknowledge-basedsystemmaylookfortheknownpre-definedtrendsaskedbytheexperts(e.g.,peaksofheartratesignals).However,onecanalsoanal-ysethedataitselfinordertodetectinterestingandmeaningfulpatternsfoundthroughoutthedata(e.g.,repeatedsuddendropsofheartratewhilesleeping).Thesearenotnecessarilyrequestedbyorevenvisibletotheexperts,buttheyareworthbeingreported.Nevertheless,theprimarychallengeremainsfinding


concepts in a multi-dimensional space [86]. In the literature, conceptual spacesare principally derived in a knowledge-driven manner. These spaces operate onthe assumption that there is prior knowledge from perceptual mechanisms orexperts that manually initialises the elements of the conceptual space (i.e., do-mains, quality dimensions, and concepts’ regions) [10,189]. However, since thisthesis relies on observed input data, the motivational challenge arises of how toautomatically construct a conceptual space from the given information [131]in order to perform concept learning and semantic inference tasks. Performingthese task are the required steps to address the problem of concept description(especially to describe unknown observations). This is an important motiva-tion, due to a growing class of problems that involves more complicated inputobservations. These problems deal with raw sensor data that have little or noprior knowledge concerning their semantic significance [188]. Therefore, spec-ifying the interpretable elements of a representational model for such problemsis no trivial task.

New applications in several domains are increasingly required to rely on au-tomatic methods for forming concepts directly from sensor data. One area thatsensor data is crucially used in is the field of health monitoring, particularly inclinical conditions. Monitoring the vital signs of the subjects (whether healthyor unhealthy) is essential if the medical domain is to identify the different be-haviours of the health parameters as symptoms of various medical diseases [28].Such physiological parameters include heart rate, respiration rate, blood pres-sure, etc. Various sensors have been developed to measure these vital signs inboth clinical conditions and home healthcare systems, and these accumulatea massive amount of data. With regard to wearable sensors, the continuousparameters in the form of time series signals are of particular interest in thisresearch (in contrast to discrete ones), since it is more critical to consider thehigh resolution sequence of information (with very small time-stamps).

Critical challenges in the field of health monitoring include not only statis-tically mining massive amount of physiological time series data, but also dis-covering understandable interpretations for the extracted information [32]. Anew aspect of analysing sensor data will involve going beyond expert knowl-edge and recognising information (e.g., events, patterns, anomalies) that is notpre-defined by the system. This aspect will lead to the analysis phase being con-ducted in a data-driven way in order to reveal information that was not pre-viously seen, but is worth analysing and interpreting. A motivational exampleis the exploitation of interesting trends and patterns in sensor time series data.A knowledge-based system may look for the known pre-defined trends askedby the experts (e.g., peaks of heart rate signals). However, one can also anal-yse the data itself in order to detect interesting and meaningful patterns foundthroughout the data (e.g., repeated sudden drops of heart rate while sleeping).These are not necessarily requested by or even visible to the experts, but theyare worth being reported. Nevertheless, the primary challenge remains finding


conceptsinamulti-dimensionalspace[86].Intheliterature,conceptualspacesareprincipallyderivedinaknowledge-drivenmanner.Thesespacesoperateontheassumptionthatthereispriorknowledgefromperceptualmechanismsorexpertsthatmanuallyinitialisestheelementsoftheconceptualspace(i.e.,do-mains,qualitydimensions,andconcepts’regions)[10,189].However,sincethisthesisreliesonobservedinputdata,themotivationalchallengearisesofhowtoautomaticallyconstructaconceptualspacefromthegiveninformation[131]inordertoperformconceptlearningandsemanticinferencetasks.Performingthesetaskaretherequiredstepstoaddresstheproblemofconceptdescription(especiallytodescribeunknownobservations).Thisisanimportantmotiva-tion,duetoagrowingclassofproblemsthatinvolvesmorecomplicatedinputobservations.Theseproblemsdealwithrawsensordatathathavelittleornopriorknowledgeconcerningtheirsemanticsignificance[188].Therefore,spec-ifyingtheinterpretableelementsofarepresentationalmodelforsuchproblemsisnotrivialtask.

Newapplicationsinseveraldomainsareincreasinglyrequiredtorelyonau-tomaticmethodsforformingconceptsdirectlyfromsensordata.Oneareathatsensordataiscruciallyusedinisthefieldofhealthmonitoring,particularlyinclinicalconditions.Monitoringthevitalsignsofthesubjects(whetherhealthyorunhealthy)isessentialifthemedicaldomainistoidentifythedifferentbe-havioursofthehealthparametersassymptomsofvariousmedicaldiseases[28].Suchphysiologicalparametersincludeheartrate,respirationrate,bloodpres-sure,etc.Varioussensorshavebeendevelopedtomeasurethesevitalsignsinbothclinicalconditionsandhomehealthcaresystems,andtheseaccumulateamassiveamountofdata.Withregardtowearablesensors,thecontinuousparametersintheformoftimeseriessignalsareofparticularinterestinthisresearch(incontrasttodiscreteones),sinceitismorecriticaltoconsiderthehighresolutionsequenceofinformation(withverysmalltime-stamps).

Criticalchallengesinthefieldofhealthmonitoringincludenotonlystatis-ticallyminingmassiveamountofphysiologicaltimeseriesdata,butalsodis-coveringunderstandableinterpretationsfortheextractedinformation[32].Anewaspectofanalysingsensordatawillinvolvegoingbeyondexpertknowl-edgeandrecognisinginformation(e.g.,events,patterns,anomalies)thatisnotpre-definedbythesystem.Thisaspectwillleadtotheanalysisphasebeingcon-ductedinadata-drivenwayinordertorevealinformationthatwasnotpre-viouslyseen,butisworthanalysingandinterpreting.Amotivationalexampleistheexploitationofinterestingtrendsandpatternsinsensortimeseriesdata.Aknowledge-basedsystemmaylookfortheknownpre-definedtrendsaskedbytheexperts(e.g.,peaksofheartratesignals).However,onecanalsoanal-ysethedataitselfinordertodetectinterestingandmeaningfulpatternsfoundthroughoutthedata(e.g.,repeatedsuddendropsofheartratewhilesleeping).Thesearenotnecessarilyrequestedbyorevenvisibletotheexperts,buttheyareworthbeingreported.Nevertheless,theprimarychallengeremainsfinding

1.2. PROBLEM STATEMENT 5

a suitable way to describe such data-driven extracted information. Here is thepoint where this thesis argues that a semantic representation can play the role ofmodelling physiological time series in order to describe unknown observationsor patterns in natural language.

1.2 Problem Statement

The problem of semantically modelling and describing the unknown observa-tion, which is termed the general semantic representation problem, forms thecore of this research. This thesis first considers the task of semantic representa-tion in describing the numerical observations (i.e., the theoretical focus of thisthesis). The semantic representation task investigates representational modelsin order to be able to bind perceived numerical data as input into a set of lin-guistic characterisations as output.

In addition to considering this problem at the theoretical level, its instanti-ation is also presented with regard to a practical problem in a real-world ap-plication. This research proposes a specific use of semantic representation forthe task of linguistically describing the unknown time series patterns derivedfrom physiological sensor data (i.e., the application focus of this thesis). Theproposed approach investigates data mining methods for analysing a set of rawphysiological time series data as input, and it considers linguistic descriptionapproaches to generate a set of human-readable natural language text as out-put. Figure 1.2 presents a schematic overview of the tasks to be performed inboth the theoretical focus and the application framework of this research.

The primary assumption for the stated problem is that there is no adequateexpert or domain knowledge to be fed to the model in order to describe unseenbut interesting observations extracted from a data set. Therefore, the proposedapproach relies on the observed data and its own features in order to providelinguistic descriptions. This means that the objective is not to describe the obser-vations on the basis of a set of pre-defined knowledge within a given knowledgerepresentation (such as describing an eagle or a horse using an ontology of ani-mals that perfectly defines the concepts of eagle and horse). Rather, the aim is tobuild a semantic representation of the domain using the known data set and itsproperties. Moreover, it is to extract and characterise the unseen but interestingobservations within this representation (such as describing a hippogriff withina semantic representation of animals that includes the known concepts of eagleand horse, but does not necessarily include the new concept of hippogriff).

All the tasks defined within this solution have some assumptions regardingtheir inputs and outputs. For the semantic representation task, it is assumedthat the input is a set of perceived or processed numerical information. Thisinformation is in the form of human understandable attributes (known as se-mantic features). Depending on the format of the raw input data, the semanticfeatures can be either directly captured or be computed from the given data.Skin colour is an example of the former type of features that can be included

1.2.PROBLEMSTATEMENT5

asuitablewaytodescribesuchdata-drivenextractedinformation.Hereisthepointwherethisthesisarguesthatasemanticrepresentationcanplaytheroleofmodellingphysiologicaltimeseriesinordertodescribeunknownobservationsorpatternsinnaturallanguage.

1.2ProblemStatement

Theproblemofsemanticallymodellinganddescribingtheunknownobserva-tion,whichistermedthegeneralsemanticrepresentationproblem,formsthecoreofthisresearch.Thisthesisfirstconsidersthetaskofsemanticrepresenta-tionindescribingthenumericalobservations(i.e.,thetheoreticalfocusofthisthesis).Thesemanticrepresentationtaskinvestigatesrepresentationalmodelsinordertobeabletobindperceivednumericaldataasinputintoasetoflin-guisticcharacterisationsasoutput.

Inadditiontoconsideringthisproblematthetheoreticallevel,itsinstanti-ationisalsopresentedwithregardtoapracticalprobleminareal-worldap-plication.Thisresearchproposesaspecificuseofsemanticrepresentationforthetaskoflinguisticallydescribingtheunknowntimeseriespatternsderivedfromphysiologicalsensordata(i.e.,theapplicationfocusofthisthesis).Theproposedapproachinvestigatesdataminingmethodsforanalysingasetofrawphysiologicaltimeseriesdataasinput,anditconsiderslinguisticdescriptionapproachestogenerateasetofhuman-readablenaturallanguagetextasout-put.Figure1.2presentsaschematicoverviewofthetaskstobeperformedinboththetheoreticalfocusandtheapplicationframeworkofthisresearch.

Theprimaryassumptionforthestatedproblemisthatthereisnoadequateexpertordomainknowledgetobefedtothemodelinordertodescribeunseenbutinterestingobservationsextractedfromadataset.Therefore,theproposedapproachreliesontheobserveddataanditsownfeaturesinordertoprovidelinguisticdescriptions.Thismeansthattheobjectiveisnottodescribetheobser-vationsonthebasisofasetofpre-definedknowledgewithinagivenknowledgerepresentation(suchasdescribinganeagleorahorseusinganontologyofani-malsthatperfectlydefinestheconceptsofeagleandhorse).Rather,theaimistobuildasemanticrepresentationofthedomainusingtheknowndatasetanditsproperties.Moreover,itistoextractandcharacterisetheunseenbutinterestingobservationswithinthisrepresentation(suchasdescribingahippogriffwithinasemanticrepresentationofanimalsthatincludestheknownconceptsofeagleandhorse,butdoesnotnecessarilyincludethenewconceptofhippogriff).

Allthetasksdefinedwithinthissolutionhavesomeassumptionsregardingtheirinputsandoutputs.Forthesemanticrepresentationtask,itisassumedthattheinputisasetofperceivedorprocessednumericalinformation.Thisinformationisintheformofhumanunderstandableattributes(knownasse-manticfeatures).Dependingontheformatoftherawinputdata,thesemanticfeaturescanbeeitherdirectlycapturedorbecomputedfromthegivendata.Skincolourisanexampleoftheformertypeoffeaturesthatcanbeincluded

1.2. PROBLEM STATEMENT 5

a suitable way to describe such data-driven extracted information. Here is thepoint where this thesis argues that a semantic representation can play the role ofmodelling physiological time series in order to describe unknown observationsor patterns in natural language.

1.2 Problem Statement

The problem of semantically modelling and describing the unknown observa-tion, which is termed the general semantic representation problem, forms thecore of this research. This thesis first considers the task of semantic representa-tion in describing the numerical observations (i.e., the theoretical focus of thisthesis). The semantic representation task investigates representational modelsin order to be able to bind perceived numerical data as input into a set of lin-guistic characterisations as output.

In addition to considering this problem at the theoretical level, its instanti-ation is also presented with regard to a practical problem in a real-world ap-plication. This research proposes a specific use of semantic representation forthe task of linguistically describing the unknown time series patterns derivedfrom physiological sensor data (i.e., the application focus of this thesis). Theproposed approach investigates data mining methods for analysing a set of rawphysiological time series data as input, and it considers linguistic descriptionapproaches to generate a set of human-readable natural language text as out-put. Figure 1.2 presents a schematic overview of the tasks to be performed inboth the theoretical focus and the application framework of this research.

The primary assumption for the stated problem is that there is no adequateexpert or domain knowledge to be fed to the model in order to describe unseenbut interesting observations extracted from a data set. Therefore, the proposedapproach relies on the observed data and its own features in order to providelinguistic descriptions. This means that the objective is not to describe the obser-vations on the basis of a set of pre-defined knowledge within a given knowledgerepresentation (such as describing an eagle or a horse using an ontology of ani-mals that perfectly defines the concepts of eagle and horse). Rather, the aim is tobuild a semantic representation of the domain using the known data set and itsproperties. Moreover, it is to extract and characterise the unseen but interestingobservations within this representation (such as describing a hippogriff withina semantic representation of animals that includes the known concepts of eagleand horse, but does not necessarily include the new concept of hippogriff).

All the tasks defined within this solution have some assumptions regardingtheir inputs and outputs. For the semantic representation task, it is assumedthat the input is a set of perceived or processed numerical information. Thisinformation is in the form of human understandable attributes (known as se-mantic features). Depending on the format of the raw input data, the semanticfeatures can be either directly captured or be computed from the given data.Skin colour is an example of the former type of features that can be included

1.2.PROBLEMSTATEMENT5

asuitablewaytodescribesuchdata-drivenextractedinformation.Hereisthepointwherethisthesisarguesthatasemanticrepresentationcanplaytheroleofmodellingphysiologicaltimeseriesinordertodescribeunknownobservationsorpatternsinnaturallanguage.

1.2ProblemStatement

Theproblemofsemanticallymodellinganddescribingtheunknownobserva-tion,whichistermedthegeneralsemanticrepresentationproblem,formsthecoreofthisresearch.Thisthesisfirstconsidersthetaskofsemanticrepresenta-tionindescribingthenumericalobservations(i.e.,thetheoreticalfocusofthisthesis).Thesemanticrepresentationtaskinvestigatesrepresentationalmodelsinordertobeabletobindperceivednumericaldataasinputintoasetoflin-guisticcharacterisationsasoutput.

Inadditiontoconsideringthisproblematthetheoreticallevel,itsinstanti-ationisalsopresentedwithregardtoapracticalprobleminareal-worldap-plication.Thisresearchproposesaspecificuseofsemanticrepresentationforthetaskoflinguisticallydescribingtheunknowntimeseriespatternsderivedfromphysiologicalsensordata(i.e.,theapplicationfocusofthisthesis).Theproposedapproachinvestigatesdataminingmethodsforanalysingasetofrawphysiologicaltimeseriesdataasinput,anditconsiderslinguisticdescriptionapproachestogenerateasetofhuman-readablenaturallanguagetextasout-put.Figure1.2presentsaschematicoverviewofthetaskstobeperformedinboththetheoreticalfocusandtheapplicationframeworkofthisresearch.

Theprimaryassumptionforthestatedproblemisthatthereisnoadequateexpertordomainknowledgetobefedtothemodelinordertodescribeunseenbutinterestingobservationsextractedfromadataset.Therefore,theproposedapproachreliesontheobserveddataanditsownfeaturesinordertoprovidelinguisticdescriptions.Thismeansthattheobjectiveisnottodescribetheobser-vationsonthebasisofasetofpre-definedknowledgewithinagivenknowledgerepresentation(suchasdescribinganeagleorahorseusinganontologyofani-malsthatperfectlydefinestheconceptsofeagleandhorse).Rather,theaimistobuildasemanticrepresentationofthedomainusingtheknowndatasetanditsproperties.Moreover,itistoextractandcharacterisetheunseenbutinterestingobservationswithinthisrepresentation(suchasdescribingahippogriffwithinasemanticrepresentationofanimalsthatincludestheknownconceptsofeagleandhorse,butdoesnotnecessarilyincludethenewconceptofhippogriff).

Allthetasksdefinedwithinthissolutionhavesomeassumptionsregardingtheirinputsandoutputs.Forthesemanticrepresentationtask,itisassumedthattheinputisasetofperceivedorprocessednumericalinformation.Thisinformationisintheformofhumanunderstandableattributes(knownasse-manticfeatures).Dependingontheformatoftherawinputdata,thesemanticfeaturescanbeeitherdirectlycapturedorbecomputedfromthegivendata.Skincolourisanexampleoftheformertypeoffeaturesthatcanbeincluded


Figure 1.2: A schematic overview of the tasks to be performed in both theoret-ical (inner box) and application (outer box) focuses of the thesis.

in a data set of animals. However, the fluctuation of a signal is an example ofthe latter features that can be computed for a time series in a data set of sensorrecords. Therefore, the considered application for the physiological sensor datainvolves data analysis phase in order to extract the meaningful features that tobe supplied to the semantic representation. Nevertheless, the proposed modelfor the task of semantic representation can be utilised on different types of datasets that include semantic features in their observations.

The output of the semantic representation is assumed to be a set of in-ferred linguistic information that characterises the input numerical observa-tions. These linguistic characterisations (i.e., words) further need to be plannedand realised into natural language messages (i.e., sentences) by means of natu-ral language generation (NLG) systems. Although the text generation phase isparticularised as a task in the framework for physiological sensor data, it cangenerally be applied to any linguistic characterisation that emerges from the se-mantic representation model. It is worth to mention that although the focus ofthis research is on the specific set of physiological time series data (which wasthe initial motivation for this thesis from the field of healthcare monitoring),the development of the framework is applicable as a data-to-text system in avariety of applications that deal with the problem of the linguistic descriptionof data.

1.3 Research Question

The overall objective that this dissertation aims to achieve can be expressed asfollows:

O: “Proposing data-driven approaches beyond expert knowledge to performthe tasks of (1) mining numerical observation (e.g., sensor data), and (2)the semantic representation of the derived information, in order to (3)describe unseen but interesting observations in natural language.”


Figure1.2:Aschematicoverviewofthetaskstobeperformedinboththeoret-ical(innerbox)andapplication(outerbox)focusesofthethesis.

inadatasetofanimals.However,thefluctuationofasignalisanexampleofthelatterfeaturesthatcanbecomputedforatimeseriesinadatasetofsensorrecords.Therefore,theconsideredapplicationforthephysiologicalsensordatainvolvesdataanalysisphaseinordertoextractthemeaningfulfeaturesthattobesuppliedtothesemanticrepresentation.Nevertheless,theproposedmodelforthetaskofsemanticrepresentationcanbeutilisedondifferenttypesofdatasetsthatincludesemanticfeaturesintheirobservations.

Theoutputofthesemanticrepresentationisassumedtobeasetofin-ferredlinguisticinformationthatcharacterisestheinputnumericalobserva-tions.Theselinguisticcharacterisations(i.e.,words)furtherneedtobeplannedandrealisedintonaturallanguagemessages(i.e.,sentences)bymeansofnatu-rallanguagegeneration(NLG)systems.Althoughthetextgenerationphaseisparticularisedasataskintheframeworkforphysiologicalsensordata,itcangenerallybeappliedtoanylinguisticcharacterisationthatemergesfromthese-manticrepresentationmodel.Itisworthtomentionthatalthoughthefocusofthisresearchisonthespecificsetofphysiologicaltimeseriesdata(whichwastheinitialmotivationforthisthesisfromthefieldofhealthcaremonitoring),thedevelopmentoftheframeworkisapplicableasadata-to-textsysteminavarietyofapplicationsthatdealwiththeproblemofthelinguisticdescriptionofdata.

1.3ResearchQuestion

Theoverallobjectivethatthisdissertationaimstoachievecanbeexpressedasfollows:

O:“Proposingdata-drivenapproachesbeyondexpertknowledgetoperformthetasksof(1)miningnumericalobservation(e.g.,sensordata),and(2)thesemanticrepresentationofthederivedinformation,inorderto(3)describeunseenbutinterestingobservationsinnaturallanguage.”


Figure 1.2: A schematic overview of the tasks to be performed in both theoret-ical (inner box) and application (outer box) focuses of the thesis.

in a data set of animals. However, the fluctuation of a signal is an example ofthe latter features that can be computed for a time series in a data set of sensorrecords. Therefore, the considered application for the physiological sensor datainvolves data analysis phase in order to extract the meaningful features that tobe supplied to the semantic representation. Nevertheless, the proposed modelfor the task of semantic representation can be utilised on different types of datasets that include semantic features in their observations.

The output of the semantic representation is assumed to be a set of in-ferred linguistic information that characterises the input numerical observa-tions. These linguistic characterisations (i.e., words) further need to be plannedand realised into natural language messages (i.e., sentences) by means of natu-ral language generation (NLG) systems. Although the text generation phase isparticularised as a task in the framework for physiological sensor data, it cangenerally be applied to any linguistic characterisation that emerges from the se-mantic representation model. It is worth to mention that although the focus ofthis research is on the specific set of physiological time series data (which wasthe initial motivation for this thesis from the field of healthcare monitoring),the development of the framework is applicable as a data-to-text system in avariety of applications that deal with the problem of the linguistic descriptionof data.

1.3 Research Question

The overall objective that this dissertation aims to achieve can be expressed asfollows:

O: “Proposing data-driven approaches beyond expert knowledge to performthe tasks of (1) mining numerical observation (e.g., sensor data), and (2)the semantic representation of the derived information, in order to (3)describe unseen but interesting observations in natural language.”


Figure1.2:Aschematicoverviewofthetaskstobeperformedinboththeoret-ical(innerbox)andapplication(outerbox)focusesofthethesis.

inadatasetofanimals.However,thefluctuationofasignalisanexampleofthelatterfeaturesthatcanbecomputedforatimeseriesinadatasetofsensorrecords.Therefore,theconsideredapplicationforthephysiologicalsensordatainvolvesdataanalysisphaseinordertoextractthemeaningfulfeaturesthattobesuppliedtothesemanticrepresentation.Nevertheless,theproposedmodelforthetaskofsemanticrepresentationcanbeutilisedondifferenttypesofdatasetsthatincludesemanticfeaturesintheirobservations.

Theoutputofthesemanticrepresentationisassumedtobeasetofin-ferredlinguisticinformationthatcharacterisestheinputnumericalobserva-tions.Theselinguisticcharacterisations(i.e.,words)furtherneedtobeplannedandrealisedintonaturallanguagemessages(i.e.,sentences)bymeansofnatu-rallanguagegeneration(NLG)systems.Althoughthetextgenerationphaseisparticularisedasataskintheframeworkforphysiologicalsensordata,itcangenerallybeappliedtoanylinguisticcharacterisationthatemergesfromthese-manticrepresentationmodel.Itisworthtomentionthatalthoughthefocusofthisresearchisonthespecificsetofphysiologicaltimeseriesdata(whichwastheinitialmotivationforthisthesisfromthefieldofhealthcaremonitoring),thedevelopmentoftheframeworkisapplicableasadata-to-textsysteminavarietyofapplicationsthatdealwiththeproblemofthelinguisticdescriptionofdata.

1.3ResearchQuestion

Theoverallobjectivethatthisdissertationaimstoachievecanbeexpressedasfollows:

O:“Proposingdata-drivenapproachesbeyondexpertknowledgetoperformthetasksof(1)miningnumericalobservation(e.g.,sensordata),and(2)thesemanticrepresentationofthederivedinformation,inorderto(3)describeunseenbutinterestingobservationsinnaturallanguage.”

1.4. CONTRIBUTIONS 7

As noted in the problem statement, the semantic representation problemdeals with a set of derived information (perceived or processed features). There-fore, one research question related to the theoretical part of the research focuseson the role of semantic representation and linguistic description, which is re-lated to tasks (2) and (3) in objective O. Generally speaking, the question is

R1: “How can perceived numerical information be semantically representedand described in a linguistic form?”

This can be divided into more specific research questions as follows:

• How can a semantic representation of a numerical data set automaticallybe created in a data-driven manner, based on the observed exemplars andtheir semantic features?

• How can these observations be conceptualised within the representationconstructed in order to form the concepts of the data set’s domain?

• How can this representation be utilised in order to infer linguistic char-acterisations for a set of new unknown observations in natural language?

Likewise, with regard to the problem statement about the application partof this thesis, the framework is dedicated to turning raw numerical data intonatural language text. More specifically, however, it is concerned with consid-ering real-world data sets from the field of healthcare (i.e., time series patternsderived from physiological sensor data). Therefore, the research question in theapplication part focuses on the role of data analysis and the linguistic descrip-tion of patterns, which is related to tasks (1) and (3) in objective O. Generallyspeaking, the question is

R2: “How can unseen but interesting patterns in physiological sensor data beextracted, represented, and linguistically described?”

This can also be divided into more specific research questions as follows:

• How can data-driven approaches be applied in order 1) to extract distinc-tive time series patterns in different clinical settings, and 2) to mine thetemporal relations between multi-channels of physiological sensor data?

• How can a data-driven conceptual space be built as a semantic represen-tation of the time series patterns in order to infer linguistic characterisa-tions for the patterns?

• How can meaningful, interesting, and useful messages be generated innatural language for numerical patterns and their temporal relations us-ing the proposed semantic model?

1.4 Contributions

Given the general objective O, the core of the contributions in this thesis isthe notion of data-driven. This applies to both the research questions, which

1.4.CONTRIBUTIONS7

Asnotedintheproblemstatement,thesemanticrepresentationproblemdealswithasetofderivedinformation(perceivedorprocessedfeatures).There-fore,oneresearchquestionrelatedtothetheoreticalpartoftheresearchfocusesontheroleofsemanticrepresentationandlinguisticdescription,whichisre-latedtotasks(2)and(3)inobjectiveO.Generallyspeaking,thequestionis

R1:“Howcanperceivednumericalinformationbesemanticallyrepresentedanddescribedinalinguisticform?”

Thiscanbedividedintomorespecificresearchquestionsasfollows:

•Howcanasemanticrepresentationofanumericaldatasetautomaticallybecreatedinadata-drivenmanner,basedontheobservedexemplarsandtheirsemanticfeatures?

•Howcantheseobservationsbeconceptualisedwithintherepresentationconstructedinordertoformtheconceptsofthedataset’sdomain?

•Howcanthisrepresentationbeutilisedinordertoinferlinguisticchar-acterisationsforasetofnewunknownobservationsinnaturallanguage?

Likewise,withregardtotheproblemstatementabouttheapplicationpartofthisthesis,theframeworkisdedicatedtoturningrawnumericaldataintonaturallanguagetext.Morespecifically,however,itisconcernedwithconsid-eringreal-worlddatasetsfromthefieldofhealthcare(i.e.,timeseriespatternsderivedfromphysiologicalsensordata).Therefore,theresearchquestionintheapplicationpartfocusesontheroleofdataanalysisandthelinguisticdescrip-tionofpatterns,whichisrelatedtotasks(1)and(3)inobjectiveO.Generallyspeaking,thequestionis

R2:“Howcanunseenbutinterestingpatternsinphysiologicalsensordatabeextracted,represented,andlinguisticallydescribed?”

Thiscanalsobedividedintomorespecificresearchquestionsasfollows:

•Howcandata-drivenapproachesbeappliedinorder1)toextractdistinc-tivetimeseriespatternsindifferentclinicalsettings,and2)tominethetemporalrelationsbetweenmulti-channelsofphysiologicalsensordata?

•Howcanadata-drivenconceptualspacebebuiltasasemanticrepresen-tationofthetimeseriespatternsinordertoinferlinguisticcharacterisa-tionsforthepatterns?

•Howcanmeaningful,interesting,andusefulmessagesbegeneratedinnaturallanguagefornumericalpatternsandtheirtemporalrelationsus-ingtheproposedsemanticmodel?

1.4Contributions

GiventhegeneralobjectiveO,thecoreofthecontributionsinthisthesisisthenotionofdata-driven.Thisappliestoboththeresearchquestions,which

1.4. CONTRIBUTIONS 7

As noted in the problem statement, the semantic representation problemdeals with a set of derived information (perceived or processed features). There-fore, one research question related to the theoretical part of the research focuseson the role of semantic representation and linguistic description, which is re-lated to tasks (2) and (3) in objective O. Generally speaking, the question is

R1: “How can perceived numerical information be semantically representedand described in a linguistic form?”

This can be divided into more specific research questions as follows:

• How can a semantic representation of a numerical data set automaticallybe created in a data-driven manner, based on the observed exemplars andtheir semantic features?

• How can these observations be conceptualised within the representationconstructed in order to form the concepts of the data set’s domain?

• How can this representation be utilised in order to infer linguistic char-acterisations for a set of new unknown observations in natural language?

Likewise, with regard to the problem statement about the application partof this thesis, the framework is dedicated to turning raw numerical data intonatural language text. More specifically, however, it is concerned with consid-ering real-world data sets from the field of healthcare (i.e., time series patternsderived from physiological sensor data). Therefore, the research question in theapplication part focuses on the role of data analysis and the linguistic descrip-tion of patterns, which is related to tasks (1) and (3) in objective O. Generallyspeaking, the question is

R2: “How can unseen but interesting patterns in physiological sensor data beextracted, represented, and linguistically described?”

This can also be divided into more specific research questions as follows:

• How can data-driven approaches be applied in order 1) to extract distinc-tive time series patterns in different clinical settings, and 2) to mine thetemporal relations between multi-channels of physiological sensor data?

• How can a data-driven conceptual space be built as a semantic represen-tation of the time series patterns in order to infer linguistic characterisa-tions for the patterns?

• How can meaningful, interesting, and useful messages be generated innatural language for numerical patterns and their temporal relations us-ing the proposed semantic model?

1.4 Contributions

Given the general objective O, the core of the contributions in this thesis isthe notion of data-driven. This applies to both the research questions, which

1.4.CONTRIBUTIONS7

Asnotedintheproblemstatement,thesemanticrepresentationproblemdealswithasetofderivedinformation(perceivedorprocessedfeatures).There-fore,oneresearchquestionrelatedtothetheoreticalpartoftheresearchfocusesontheroleofsemanticrepresentationandlinguisticdescription,whichisre-latedtotasks(2)and(3)inobjectiveO.Generallyspeaking,thequestionis

R1:“Howcanperceivednumericalinformationbesemanticallyrepresentedanddescribedinalinguisticform?”

Thiscanbedividedintomorespecificresearchquestionsasfollows:

•Howcanasemanticrepresentationofanumericaldatasetautomaticallybecreatedinadata-drivenmanner,basedontheobservedexemplarsandtheirsemanticfeatures?

•Howcantheseobservationsbeconceptualisedwithintherepresentationconstructedinordertoformtheconceptsofthedataset’sdomain?

•Howcanthisrepresentationbeutilisedinordertoinferlinguisticchar-acterisationsforasetofnewunknownobservationsinnaturallanguage?

Likewise,withregardtotheproblemstatementabouttheapplicationpartofthisthesis,theframeworkisdedicatedtoturningrawnumericaldataintonaturallanguagetext.Morespecifically,however,itisconcernedwithconsid-eringreal-worlddatasetsfromthefieldofhealthcare(i.e.,timeseriespatternsderivedfromphysiologicalsensordata).Therefore,theresearchquestionintheapplicationpartfocusesontheroleofdataanalysisandthelinguisticdescrip-tionofpatterns,whichisrelatedtotasks(1)and(3)inobjectiveO.Generallyspeaking,thequestionis

R2:“Howcanunseenbutinterestingpatternsinphysiologicalsensordatabeextracted,represented,andlinguisticallydescribed?”

Thiscanalsobedividedintomorespecificresearchquestionsasfollows:

•Howcandata-drivenapproachesbeappliedinorder1)toextractdistinc-tivetimeseriespatternsindifferentclinicalsettings,and2)tominethetemporalrelationsbetweenmulti-channelsofphysiologicalsensordata?

•Howcanadata-drivenconceptualspacebebuiltasasemanticrepresen-tationofthetimeseriespatternsinordertoinferlinguisticcharacterisa-tionsforthepatterns?

•Howcanmeaningful,interesting,andusefulmessagesbegeneratedinnaturallanguagefornumericalpatternsandtheirtemporalrelationsus-ingtheproposedsemanticmodel?

1.4Contributions

GiventhegeneralobjectiveO,thecoreofthecontributionsinthisthesisisthenotionofdata-driven.Thisappliestoboththeresearchquestions,which


are concerned with semantically representing the perceived information, andnumerically extracting interesting sensor patterns.

The contributions of this thesis are organised into four items. The first twocontributions C1 and C2 (which address the research question R1), presentan automatic construction of a conceptual space from numerical data, whichcan then be used to infer the semantic interpretation of novel unknown obser-vations. The other two contributions C3 and C4 (which address the researchquestion R2), present the automatic ways of extracting prototypical time seriespatterns from various physiological sensor data using temporal rule mining. Inaddition, they apply various linguistic description methods (including the pro-posed conceptual spaces) in order to generate text for such time series patterns.

C1: Data-driven construction of conceptual spaces:

In the theoretical part of this work, a data-driven approach is proposedin order to automatically construct conceptual spaces and perform con-cept formation based on the input exemplars of a numerical data set. In-stead of initialising the domains and dimensions of the conceptual spacefrom a priori knowledge or forming concepts in a rule-based manner, theproposed data-driven process automatically determines the relevant do-mains and dimensions on the basis of its ability to distinguish betweenexemplars of different concepts. This determination is performed usingmachine learning (ML) techniques in order to identify the relevant fea-tures of the observed data with the primitive concepts. Furthermore, aninstance-based approach is presented for concept formation in which aconcept is formed within the conceptual space on the basis of the spatialrepresentation of its observed instances.

C2: Semantic inference in the conceptual spaces:

This thesis further proposes a semantic inference process for the concep-tual space, which is built in order to provide explainability for the novelunknown observations in natural language descriptions. By groundinga semantic representation, the conceptual space is employed in order toprovide a link between the numerical data and the semantic characterisa-tion of the concepts. For a new observation, its representation within theconceptual space specifies either its inclusion in the known concepts orits values with regard the quality dimensions. This set of information isthen mapped to a symbol space that infers the linguistic characterisationof the observation based on a set of concepts and features involved. Thiswork will demonstrate the utility of conceptual spaces as a solution tothe task of content determination within a natural language generation(NLG) system.

C3: Data-driven mining of prototypical patterns and temporal rules in phys-iological sensor data:


areconcernedwithsemanticallyrepresentingtheperceivedinformation,andnumericallyextractinginterestingsensorpatterns.

Thecontributionsofthisthesisareorganisedintofouritems.ThefirsttwocontributionsC1andC2(whichaddresstheresearchquestionR1),presentanautomaticconstructionofaconceptualspacefromnumericaldata,whichcanthenbeusedtoinferthesemanticinterpretationofnovelunknownobser-vations.TheothertwocontributionsC3andC4(whichaddresstheresearchquestionR2),presenttheautomaticwaysofextractingprototypicaltimeseriespatternsfromvariousphysiologicalsensordatausingtemporalrulemining.Inaddition,theyapplyvariouslinguisticdescriptionmethods(includingthepro-posedconceptualspaces)inordertogeneratetextforsuchtimeseriespatterns.

C1:Data-drivenconstructionofconceptualspaces:

Inthetheoreticalpartofthiswork,adata-drivenapproachisproposedinordertoautomaticallyconstructconceptualspacesandperformcon-ceptformationbasedontheinputexemplarsofanumericaldataset.In-steadofinitialisingthedomainsanddimensionsoftheconceptualspacefromaprioriknowledgeorformingconceptsinarule-basedmanner,theproposeddata-drivenprocessautomaticallydeterminestherelevantdo-mainsanddimensionsonthebasisofitsabilitytodistinguishbetweenexemplarsofdifferentconcepts.Thisdeterminationisperformedusingmachinelearning(ML)techniquesinordertoidentifytherelevantfea-turesoftheobserveddatawiththeprimitiveconcepts.Furthermore,aninstance-basedapproachispresentedforconceptformationinwhichaconceptisformedwithintheconceptualspaceonthebasisofthespatialrepresentationofitsobservedinstances.

C2:Semanticinferenceintheconceptualspaces:

Thisthesisfurtherproposesasemanticinferenceprocessfortheconcep-tualspace,whichisbuiltinordertoprovideexplainabilityforthenovelunknownobservationsinnaturallanguagedescriptions.Bygroundingasemanticrepresentation,theconceptualspaceisemployedinordertoprovidealinkbetweenthenumericaldataandthesemanticcharacterisa-tionoftheconcepts.Foranewobservation,itsrepresentationwithintheconceptualspacespecifieseitheritsinclusionintheknownconceptsoritsvalueswithregardthequalitydimensions.Thissetofinformationisthenmappedtoasymbolspacethatinfersthelinguisticcharacterisationoftheobservationbasedonasetofconceptsandfeaturesinvolved.Thisworkwilldemonstratetheutilityofconceptualspacesasasolutiontothetaskofcontentdeterminationwithinanaturallanguagegeneration(NLG)system.

C3:Data-drivenminingofprototypicalpatternsandtemporalrulesinphys-iologicalsensordata:


are concerned with semantically representing the perceived information, andnumerically extracting interesting sensor patterns.

The contributions of this thesis are organised into four items. The first twocontributions C1 and C2 (which address the research question R1), presentan automatic construction of a conceptual space from numerical data, whichcan then be used to infer the semantic interpretation of novel unknown obser-vations. The other two contributions C3 and C4 (which address the researchquestion R2), present the automatic ways of extracting prototypical time seriespatterns from various physiological sensor data using temporal rule mining. Inaddition, they apply various linguistic description methods (including the pro-posed conceptual spaces) in order to generate text for such time series patterns.

C1: Data-driven construction of conceptual spaces:

In the theoretical part of this work, a data-driven approach is proposedin order to automatically construct conceptual spaces and perform con-cept formation based on the input exemplars of a numerical data set. In-stead of initialising the domains and dimensions of the conceptual spacefrom a priori knowledge or forming concepts in a rule-based manner, theproposed data-driven process automatically determines the relevant do-mains and dimensions on the basis of its ability to distinguish betweenexemplars of different concepts. This determination is performed usingmachine learning (ML) techniques in order to identify the relevant fea-tures of the observed data with the primitive concepts. Furthermore, aninstance-based approach is presented for concept formation in which aconcept is formed within the conceptual space on the basis of the spatialrepresentation of its observed instances.

C2: Semantic inference in the conceptual spaces:

This thesis further proposes a semantic inference process for the concep-tual space, which is built in order to provide explainability for the novelunknown observations in natural language descriptions. By groundinga semantic representation, the conceptual space is employed in order toprovide a link between the numerical data and the semantic characterisa-tion of the concepts. For a new observation, its representation within theconceptual space specifies either its inclusion in the known concepts orits values with regard the quality dimensions. This set of information isthen mapped to a symbol space that infers the linguistic characterisationof the observation based on a set of concepts and features involved. Thiswork will demonstrate the utility of conceptual spaces as a solution tothe task of content determination within a natural language generation(NLG) system.

C3: Data-driven mining of prototypical patterns and temporal rules in phys-iological sensor data:


areconcernedwithsemanticallyrepresentingtheperceivedinformation,andnumericallyextractinginterestingsensorpatterns.

Thecontributionsofthisthesisareorganisedintofouritems.ThefirsttwocontributionsC1andC2(whichaddresstheresearchquestionR1),presentanautomaticconstructionofaconceptualspacefromnumericaldata,whichcanthenbeusedtoinferthesemanticinterpretationofnovelunknownobser-vations.TheothertwocontributionsC3andC4(whichaddresstheresearchquestionR2),presenttheautomaticwaysofextractingprototypicaltimeseriespatternsfromvariousphysiologicalsensordatausingtemporalrulemining.Inaddition,theyapplyvariouslinguisticdescriptionmethods(includingthepro-posedconceptualspaces)inordertogeneratetextforsuchtimeseriespatterns.

C1:Data-drivenconstructionofconceptualspaces:

Inthetheoreticalpartofthiswork,adata-drivenapproachisproposedinordertoautomaticallyconstructconceptualspacesandperformcon-ceptformationbasedontheinputexemplarsofanumericaldataset.In-steadofinitialisingthedomainsanddimensionsoftheconceptualspacefromaprioriknowledgeorformingconceptsinarule-basedmanner,theproposeddata-drivenprocessautomaticallydeterminestherelevantdo-mainsanddimensionsonthebasisofitsabilitytodistinguishbetweenexemplarsofdifferentconcepts.Thisdeterminationisperformedusingmachinelearning(ML)techniquesinordertoidentifytherelevantfea-turesoftheobserveddatawiththeprimitiveconcepts.Furthermore,aninstance-basedapproachispresentedforconceptformationinwhichaconceptisformedwithintheconceptualspaceonthebasisofthespatialrepresentationofitsobservedinstances.

C2:Semanticinferenceintheconceptualspaces:

Thisthesisfurtherproposesasemanticinferenceprocessfortheconcep-tualspace,whichisbuiltinordertoprovideexplainabilityforthenovelunknownobservationsinnaturallanguagedescriptions.Bygroundingasemanticrepresentation,theconceptualspaceisemployedinordertoprovidealinkbetweenthenumericaldataandthesemanticcharacterisa-tionoftheconcepts.Foranewobservation,itsrepresentationwithintheconceptualspacespecifieseitheritsinclusionintheknownconceptsoritsvalueswithregardthequalitydimensions.Thissetofinformationisthenmappedtoasymbolspacethatinfersthelinguisticcharacterisationoftheobservationbasedonasetofconceptsandfeaturesinvolved.Thisworkwilldemonstratetheutilityofconceptualspacesasasolutiontothetaskofcontentdeterminationwithinanaturallanguagegeneration(NLG)system.

C3:Data-drivenminingofprototypicalpatternsandtemporalrulesinphys-iologicalsensordata:

1.5. THESIS OUTLINE 9

With regard to the task of data analysis for raw sensor data, this workfocuses on data-driven approaches in order 1) to extract the prototypicalpatterns that frequently occur in the recorded signals of various healthparameters such as heart rate, respiration rate, blood pressure, etc., and2) to mine the temporal rules of the extracted patterns in order to findinteresting relations between them. This data analysis is performed usingunsupervised data mining techniques to discover unseen (but worthy ofextraction) information beyond the expert knowledge. Moreover, the in-vestigation of the temporal rule mining leads to an interesting outcomeregarding the uniqueness of the extracted rules for different clinical con-ditions. This study shows the distinctive co-occurrence of prototypicalpatterns for each clinical condition.

C4: Linguistic description of time series patterns using semantic representa-tions:

Given that the aim of the framework is to generate natural language textfor physiological data, the processed time series data in the form of pat-terns are then used as input for the linguistic description approaches. Inaddition to the template-based methods for directly generating text fora pattern using stored features, one contribution of the work is to ap-ply the proposed semantic representation in order to automatically buildthe conceptual space of the time series pattern into a physiological sensordata set. This conceptual space is then utilised to infer linguistic charac-terisations for such numerical data. An empirical evaluation of the outputtext for the time series pattern demonstrates the advantages of developingsuch a conceptual space as a content determination solution for an NLGsystem.

The intersection of the contributions is the idea of proposing data-driven ap-proaches that are able to present new aspects of the sensor data (both numeri-cally and symbolically) in a data set in which the observations are not catchableby knowledge-driven approaches, are nevertheless worth being 1) numericallyextracted, 2) semantically represented, and 3) linguistically described.

1.5 Thesis Outline

This dissertation is composed of two parts, which consider the division of theproblem statement according to the theoretical and the application focuses. Thefirst part presents the theoretical focus which is the task of semantic represen-tation, while the second part presents the application focus which is the tasksof numerically analysing and generating linguistic descriptions for time seriespatterns.

1.5.THESISOUTLINE9

Withregardtothetaskofdataanalysisforrawsensordata,thisworkfocusesondata-drivenapproachesinorder1)toextracttheprototypicalpatternsthatfrequentlyoccurintherecordedsignalsofvarioushealthparameterssuchasheartrate,respirationrate,bloodpressure,etc.,and2)tominethetemporalrulesoftheextractedpatternsinordertofindinterestingrelationsbetweenthem.Thisdataanalysisisperformedusingunsuperviseddataminingtechniquestodiscoverunseen(butworthyofextraction)informationbeyondtheexpertknowledge.Moreover,thein-vestigationofthetemporalruleminingleadstoaninterestingoutcomeregardingtheuniquenessoftheextractedrulesfordifferentclinicalcon-ditions.Thisstudyshowsthedistinctiveco-occurrenceofprototypicalpatternsforeachclinicalcondition.

C4:Linguisticdescriptionoftimeseriespatternsusingsemanticrepresenta-tions:

Giventhattheaimoftheframeworkistogeneratenaturallanguagetextforphysiologicaldata,theprocessedtimeseriesdataintheformofpat-ternsarethenusedasinputforthelinguisticdescriptionapproaches.Inadditiontothetemplate-basedmethodsfordirectlygeneratingtextforapatternusingstoredfeatures,onecontributionoftheworkistoap-plytheproposedsemanticrepresentationinordertoautomaticallybuildtheconceptualspaceofthetimeseriespatternintoaphysiologicalsensordataset.Thisconceptualspaceisthenutilisedtoinferlinguisticcharac-terisationsforsuchnumericaldata.AnempiricalevaluationoftheoutputtextforthetimeseriespatterndemonstratestheadvantagesofdevelopingsuchaconceptualspaceasacontentdeterminationsolutionforanNLGsystem.

Theintersectionofthecontributionsistheideaofproposingdata-drivenap-proachesthatareabletopresentnewaspectsofthesensordata(bothnumeri-callyandsymbolically)inadatasetinwhichtheobservationsarenotcatchablebyknowledge-drivenapproaches,areneverthelessworthbeing1)numericallyextracted,2)semanticallyrepresented,and3)linguisticallydescribed.

1.5ThesisOutline

Thisdissertationiscomposedoftwoparts,whichconsiderthedivisionoftheproblemstatementaccordingtothetheoreticalandtheapplicationfocuses.Thefirstpartpresentsthetheoreticalfocuswhichisthetaskofsemanticrepresen-tation,whilethesecondpartpresentstheapplicationfocuswhichisthetasksofnumericallyanalysingandgeneratinglinguisticdescriptionsfortimeseriespatterns.


With regard to the task of data analysis for raw sensor data, this workfocuses on data-driven approaches in order 1) to extract the prototypicalpatterns that frequently occur in the recorded signals of various healthparameters such as heart rate, respiration rate, blood pressure, etc., and2) to mine the temporal rules of the extracted patterns in order to findinteresting relations between them. This data analysis is performed usingunsupervised data mining techniques to discover unseen (but worthy ofextraction) information beyond the expert knowledge. Moreover, the in-vestigation of the temporal rule mining leads to an interesting outcomeregarding the uniqueness of the extracted rules for different clinical con-ditions. This study shows the distinctive co-occurrence of prototypicalpatterns for each clinical condition.

C4: Linguistic description of time series patterns using semantic representa-tions:

Given that the aim of the framework is to generate natural language textfor physiological data, the processed time series data in the form of pat-terns are then used as input for the linguistic description approaches. Inaddition to the template-based methods for directly generating text fora pattern using stored features, one contribution of the work is to ap-ply the proposed semantic representation in order to automatically buildthe conceptual space of the time series pattern into a physiological sensordata set. This conceptual space is then utilised to infer linguistic charac-terisations for such numerical data. An empirical evaluation of the outputtext for the time series pattern demonstrates the advantages of developingsuch a conceptual space as a content determination solution for an NLGsystem.

The intersection of the contributions is the idea of proposing data-driven ap-proaches that are able to present new aspects of the sensor data (both numeri-cally and symbolically) in a data set in which the observations are not catchableby knowledge-driven approaches, are nevertheless worth being 1) numericallyextracted, 2) semantically represented, and 3) linguistically described.

1.5 Thesis Outline

This dissertation is composed of two parts, which consider the division of theproblem statement according to the theoretical and the application focuses. Thefirst part presents the theoretical focus which is the task of semantic represen-tation, while the second part presents the application focus which is the tasksof numerically analysing and generating linguistic descriptions for time seriespatterns.

1.5.THESISOUTLINE9

Withregardtothetaskofdataanalysisforrawsensordata,thisworkfocusesondata-drivenapproachesinorder1)toextracttheprototypicalpatternsthatfrequentlyoccurintherecordedsignalsofvarioushealthparameterssuchasheartrate,respirationrate,bloodpressure,etc.,and2)tominethetemporalrulesoftheextractedpatternsinordertofindinterestingrelationsbetweenthem.Thisdataanalysisisperformedusingunsuperviseddataminingtechniquestodiscoverunseen(butworthyofextraction)informationbeyondtheexpertknowledge.Moreover,thein-vestigationofthetemporalruleminingleadstoaninterestingoutcomeregardingtheuniquenessoftheextractedrulesfordifferentclinicalcon-ditions.Thisstudyshowsthedistinctiveco-occurrenceofprototypicalpatternsforeachclinicalcondition.

C4:Linguisticdescriptionoftimeseriespatternsusingsemanticrepresenta-tions:

Giventhattheaimoftheframeworkistogeneratenaturallanguagetextforphysiologicaldata,theprocessedtimeseriesdataintheformofpat-ternsarethenusedasinputforthelinguisticdescriptionapproaches.Inadditiontothetemplate-basedmethodsfordirectlygeneratingtextforapatternusingstoredfeatures,onecontributionoftheworkistoap-plytheproposedsemanticrepresentationinordertoautomaticallybuildtheconceptualspaceofthetimeseriespatternintoaphysiologicalsensordataset.Thisconceptualspaceisthenutilisedtoinferlinguisticcharac-terisationsforsuchnumericaldata.AnempiricalevaluationoftheoutputtextforthetimeseriespatterndemonstratestheadvantagesofdevelopingsuchaconceptualspaceasacontentdeterminationsolutionforanNLGsystem.

Theintersectionofthecontributionsistheideaofproposingdata-drivenap-proachesthatareabletopresentnewaspectsofthesensordata(bothnumeri-callyandsymbolically)inadatasetinwhichtheobservationsarenotcatchablebyknowledge-drivenapproaches,areneverthelessworthbeing1)numericallyextracted,2)semanticallyrepresented,and3)linguisticallydescribed.

1.5ThesisOutline

Thisdissertationiscomposedoftwoparts,whichconsiderthedivisionoftheproblemstatementaccordingtothetheoreticalandtheapplicationfocuses.Thefirstpartpresentsthetheoreticalfocuswhichisthetaskofsemanticrepresen-tation,whilethesecondpartpresentstheapplicationfocuswhichisthetasksofnumericallyanalysingandgeneratinglinguisticdescriptionsfortimeseriespatterns.


Part I: Creating semantic representations for numerical data

The first part of this thesis presents the notion of data-driven conceptual spacesas a semantic representation tool for linguistically describing the numericaldata. After an overview of the background and the related work (Chapter 2),this part explains how a conceptual space can be constructed using the informa-tion observed (Chapter 3), before proceeding to explain how it can be utilisedto infer semantics for the unseen observations (Chapter 4). This process is thenbe assessed by applying the approach to a data set of leaf examples (Chapter 5).

Chapter 2 begins with a discussion about the notion of semantic representa-tions in the literature. It then presents the background to the theory ofconceptual spaces, together with related work on the role of this the-ory in AI. It also includes background and related work on the linguisticdescription and natural language generation approaches. Finally, a briefdiscussion summarises the need to apply conceptual spaces in order toperform some tasks in existing NLG solutions.

Chapter 3 proposes a data-driven approach in order to automatically constructconceptual spaces. It starts by defining the parameters of a numerical dataset. It then explains the steps involved in determining the quality dimen-sions and the domains of a conceptual space based on the most relevantfeatures of the data. Furthermore, the chapter includes an instance-basedalgorithm for concept representation in the domains of the derived space.

Chapter 4 presents the design of an inference approach in order to providea semantic characterisation of the novel unknown observations withinthe proposed conceptual space in Chapter 3. This chapter demonstratesthe steps involved in checking the inclusion of a new instance within thedomains. It then assigns the related linguistic terms using a defined sym-bol space based on the associated concepts and quality dimensions of theinstance. Finally, the chapter presents the steps for performing the realisa-tion task in order to turn linguistic terms into natural language messages.

Chapter 5 presents a case study that demonstrates the applicability of the ap-proach proposed in Chapters 3 and 4 using a data set of leaf images. Thischapter first describes the algorithms 1) to construct a conceptual spaceof leaves, 2) to represent each concept of leaves within the domains, and3) to generate linguistic descriptions for a set of unknown leaf examples.Finally, it describes an empirical evaluation method for measuring theappropriateness of the developed space by using the worthiness of themessages generated.


PartI:Creatingsemanticrepresentationsfornumericaldata

Thefirstpartofthisthesispresentsthenotionofdata-drivenconceptualspacesasasemanticrepresentationtoolforlinguisticallydescribingthenumericaldata.Afteranoverviewofthebackgroundandtherelatedwork(Chapter2),thispartexplainshowaconceptualspacecanbeconstructedusingtheinforma-tionobserved(Chapter3),beforeproceedingtoexplainhowitcanbeutilisedtoinfersemanticsfortheunseenobservations(Chapter4).Thisprocessisthenbeassessedbyapplyingtheapproachtoadatasetofleafexamples(Chapter5).

Chapter2beginswithadiscussionaboutthenotionofsemanticrepresenta-tionsintheliterature.Itthenpresentsthebackgroundtothetheoryofconceptualspaces,togetherwithrelatedworkontheroleofthisthe-oryinAI.Italsoincludesbackgroundandrelatedworkonthelinguisticdescriptionandnaturallanguagegenerationapproaches.Finally,abriefdiscussionsummarisestheneedtoapplyconceptualspacesinordertoperformsometasksinexistingNLGsolutions.

Chapter3proposesadata-drivenapproachinordertoautomaticallyconstructconceptualspaces.Itstartsbydefiningtheparametersofanumericaldataset.Itthenexplainsthestepsinvolvedindeterminingthequalitydimen-sionsandthedomainsofaconceptualspacebasedonthemostrelevantfeaturesofthedata.Furthermore,thechapterincludesaninstance-basedalgorithmforconceptrepresentationinthedomainsofthederivedspace.

Chapter4presentsthedesignofaninferenceapproachinordertoprovideasemanticcharacterisationofthenovelunknownobservationswithintheproposedconceptualspaceinChapter3.Thischapterdemonstratesthestepsinvolvedincheckingtheinclusionofanewinstancewithinthedomains.Itthenassignstherelatedlinguistictermsusingadefinedsym-bolspacebasedontheassociatedconceptsandqualitydimensionsoftheinstance.Finally,thechapterpresentsthestepsforperformingtherealisa-tiontaskinordertoturnlinguistictermsintonaturallanguagemessages.

Chapter5presentsacasestudythatdemonstratestheapplicabilityoftheap-proachproposedinChapters3and4usingadatasetofleafimages.Thischapterfirstdescribesthealgorithms1)toconstructaconceptualspaceofleaves,2)torepresenteachconceptofleaveswithinthedomains,and3)togeneratelinguisticdescriptionsforasetofunknownleafexamples.Finally,itdescribesanempiricalevaluationmethodformeasuringtheappropriatenessofthedevelopedspacebyusingtheworthinessofthemessagesgenerated.


Part I: Creating semantic representations for numerical data

The first part of this thesis presents the notion of data-driven conceptual spacesas a semantic representation tool for linguistically describing the numericaldata. After an overview of the background and the related work (Chapter 2),this part explains how a conceptual space can be constructed using the informa-tion observed (Chapter 3), before proceeding to explain how it can be utilisedto infer semantics for the unseen observations (Chapter 4). This process is thenbe assessed by applying the approach to a data set of leaf examples (Chapter 5).

Chapter 2 begins with a discussion about the notion of semantic representa-tions in the literature. It then presents the background to the theory ofconceptual spaces, together with related work on the role of this the-ory in AI. It also includes background and related work on the linguisticdescription and natural language generation approaches. Finally, a briefdiscussion summarises the need to apply conceptual spaces in order toperform some tasks in existing NLG solutions.

Chapter 3 proposes a data-driven approach in order to automatically constructconceptual spaces. It starts by defining the parameters of a numerical dataset. It then explains the steps involved in determining the quality dimen-sions and the domains of a conceptual space based on the most relevantfeatures of the data. Furthermore, the chapter includes an instance-basedalgorithm for concept representation in the domains of the derived space.

Chapter 4 presents the design of an inference approach in order to providea semantic characterisation of the novel unknown observations withinthe proposed conceptual space in Chapter 3. This chapter demonstratesthe steps involved in checking the inclusion of a new instance within thedomains. It then assigns the related linguistic terms using a defined sym-bol space based on the associated concepts and quality dimensions of theinstance. Finally, the chapter presents the steps for performing the realisa-tion task in order to turn linguistic terms into natural language messages.

Chapter 5 presents a case study that demonstrates the applicability of the ap-proach proposed in Chapters 3 and 4 using a data set of leaf images. Thischapter first describes the algorithms 1) to construct a conceptual spaceof leaves, 2) to represent each concept of leaves within the domains, and3) to generate linguistic descriptions for a set of unknown leaf examples.Finally, it describes an empirical evaluation method for measuring theappropriateness of the developed space by using the worthiness of themessages generated.


PartI:Creatingsemanticrepresentationsfornumericaldata

Thefirstpartofthisthesispresentsthenotionofdata-drivenconceptualspacesasasemanticrepresentationtoolforlinguisticallydescribingthenumericaldata.Afteranoverviewofthebackgroundandtherelatedwork(Chapter2),thispartexplainshowaconceptualspacecanbeconstructedusingtheinforma-tionobserved(Chapter3),beforeproceedingtoexplainhowitcanbeutilisedtoinfersemanticsfortheunseenobservations(Chapter4).Thisprocessisthenbeassessedbyapplyingtheapproachtoadatasetofleafexamples(Chapter5).

Chapter2beginswithadiscussionaboutthenotionofsemanticrepresenta-tionsintheliterature.Itthenpresentsthebackgroundtothetheoryofconceptualspaces,togetherwithrelatedworkontheroleofthisthe-oryinAI.Italsoincludesbackgroundandrelatedworkonthelinguisticdescriptionandnaturallanguagegenerationapproaches.Finally,abriefdiscussionsummarisestheneedtoapplyconceptualspacesinordertoperformsometasksinexistingNLGsolutions.

Chapter3proposesadata-drivenapproachinordertoautomaticallyconstructconceptualspaces.Itstartsbydefiningtheparametersofanumericaldataset.Itthenexplainsthestepsinvolvedindeterminingthequalitydimen-sionsandthedomainsofaconceptualspacebasedonthemostrelevantfeaturesofthedata.Furthermore,thechapterincludesaninstance-basedalgorithmforconceptrepresentationinthedomainsofthederivedspace.

Chapter4presentsthedesignofaninferenceapproachinordertoprovideasemanticcharacterisationofthenovelunknownobservationswithintheproposedconceptualspaceinChapter3.Thischapterdemonstratesthestepsinvolvedincheckingtheinclusionofanewinstancewithinthedomains.Itthenassignstherelatedlinguistictermsusingadefinedsym-bolspacebasedontheassociatedconceptsandqualitydimensionsoftheinstance.Finally,thechapterpresentsthestepsforperformingtherealisa-tiontaskinordertoturnlinguistictermsintonaturallanguagemessages.

Chapter5presentsacasestudythatdemonstratestheapplicabilityoftheap-proachproposedinChapters3and4usingadatasetofleafimages.Thischapterfirstdescribesthealgorithms1)toconstructaconceptualspaceofleaves,2)torepresenteachconceptofleaveswithinthedomains,and3)togeneratelinguisticdescriptionsforasetofunknownleafexamples.Finally,itdescribesanempiricalevaluationmethodformeasuringtheappropriatenessofthedevelopedspacebyusingtheworthinessofthemessagesgenerated.


Part II: Physiological sensor data: From data analysis to

linguistic descriptions

The second part of the thesis focuses on the entire framework of describing thenumerical data using the semantic model proposed in Part I, but specifically forphysiological sensor data. A series of data analysis approaches are investigatedin order to prepare suitable information (i.e., time series patterns) that is fed tothe semantic representation. Chapters 6 to 8 are largely dedicated to this task.First, The current state of the art for the data mining approaches on sensor datais discussed (Chapter 6). Then, after exacting unseen but interesting time seriespatterns (Chapter 7) and the temporal rules between the patterns (Chapter 8),it is demonstrated how the proposed semantic model in Part I can be appliedto linguistically describe these patterns in such a framework (Chapter 9).

Chapter 6 provides a survey of work on existing data mining approaches inorder to analyse wearable sensors in health monitoring systems. It alsoconsiders the new trends in the field and the current challenges to dataanalysis in the healthcare domain.

Chapter 7 introduces the processes of collecting and acquiring the input physi-ological sensor data sets that were used in this work. It then describes thevarious unsupervised approaches first used to detect partial trends and toextract prototypical patterns in different channels of physiological timeseries data.

Chapter 8 presents a novel modified temporal rule mining approach to dis-covering the co-occurrence of prototypical patterns among various timeseries data of vital signs in a clinical condition. This chapter further de-scribes the algorithms for comparing the extracted rules in order to showthe uniqueness of the rules for different clinical conditions. Another cen-tral aspect of this chapter is the presentation of a template-based naturallanguage generation method for describing temporal rules of patterns innatural language.

Chapter 9 presents the process of applying the proposed semantic representa-tion in Part I in order to automatically construct the conceptual space ofthe abstracted physiological patterns. It then describes the process of in-ferring semantic descriptions for a set of unknown patterns. Furthermore,this chapter presents an evaluation to demonstrate the appropriateness ofthe conceptual space of the patterns used for text generation, togetherwith a comparison between the output text of the physiological patternfrom the semantic model and the output text of the template-based ap-proaches.

1.5.THESISOUTLINE11

PartII:Physiologicalsensordata:Fromdataanalysisto

linguisticdescriptions

ThesecondpartofthethesisfocusesontheentireframeworkofdescribingthenumericaldatausingthesemanticmodelproposedinPartI,butspecificallyforphysiologicalsensordata.Aseriesofdataanalysisapproachesareinvestigatedinordertopreparesuitableinformation(i.e.,timeseriespatterns)thatisfedtothesemanticrepresentation.Chapters6to8arelargelydedicatedtothistask.First,Thecurrentstateoftheartforthedataminingapproachesonsensordataisdiscussed(Chapter6).Then,afterexactingunseenbutinterestingtimeseriespatterns(Chapter7)andthetemporalrulesbetweenthepatterns(Chapter8),itisdemonstratedhowtheproposedsemanticmodelinPartIcanbeappliedtolinguisticallydescribethesepatternsinsuchaframework(Chapter9).

Chapter6providesasurveyofworkonexistingdataminingapproachesinordertoanalysewearablesensorsinhealthmonitoringsystems.Italsoconsidersthenewtrendsinthefieldandthecurrentchallengestodataanalysisinthehealthcaredomain.

Chapter7introducestheprocessesofcollectingandacquiringtheinputphysi-ologicalsensordatasetsthatwereusedinthiswork.Itthendescribesthevariousunsupervisedapproachesfirstusedtodetectpartialtrendsandtoextractprototypicalpatternsindifferentchannelsofphysiologicaltimeseriesdata.

Chapter8presentsanovelmodifiedtemporalruleminingapproachtodis-coveringtheco-occurrenceofprototypicalpatternsamongvarioustimeseriesdataofvitalsignsinaclinicalcondition.Thischapterfurtherde-scribesthealgorithmsforcomparingtheextractedrulesinordertoshowtheuniquenessoftherulesfordifferentclinicalconditions.Anothercen-tralaspectofthischapteristhepresentationofatemplate-basednaturallanguagegenerationmethodfordescribingtemporalrulesofpatternsinnaturallanguage.

Chapter9presentstheprocessofapplyingtheproposedsemanticrepresenta-tioninPartIinordertoautomaticallyconstructtheconceptualspaceoftheabstractedphysiologicalpatterns.Itthendescribestheprocessofin-ferringsemanticdescriptionsforasetofunknownpatterns.Furthermore,thischapterpresentsanevaluationtodemonstratetheappropriatenessoftheconceptualspaceofthepatternsusedfortextgeneration,togetherwithacomparisonbetweentheoutputtextofthephysiologicalpatternfromthesemanticmodelandtheoutputtextofthetemplate-basedap-proaches.


Part II: Physiological sensor data: From data analysis to

linguistic descriptions

The second part of the thesis focuses on the entire framework of describing thenumerical data using the semantic model proposed in Part I, but specifically forphysiological sensor data. A series of data analysis approaches are investigatedin order to prepare suitable information (i.e., time series patterns) that is fed tothe semantic representation. Chapters 6 to 8 are largely dedicated to this task.First, The current state of the art for the data mining approaches on sensor datais discussed (Chapter 6). Then, after exacting unseen but interesting time seriespatterns (Chapter 7) and the temporal rules between the patterns (Chapter 8),it is demonstrated how the proposed semantic model in Part I can be appliedto linguistically describe these patterns in such a framework (Chapter 9).

Chapter 6 provides a survey of work on existing data mining approaches inorder to analyse wearable sensors in health monitoring systems. It alsoconsiders the new trends in the field and the current challenges to dataanalysis in the healthcare domain.

Chapter 7 introduces the processes of collecting and acquiring the input physi-ological sensor data sets that were used in this work. It then describes thevarious unsupervised approaches first used to detect partial trends and toextract prototypical patterns in different channels of physiological timeseries data.

Chapter 8 presents a novel modified temporal rule mining approach to dis-covering the co-occurrence of prototypical patterns among various timeseries data of vital signs in a clinical condition. This chapter further de-scribes the algorithms for comparing the extracted rules in order to showthe uniqueness of the rules for different clinical conditions. Another cen-tral aspect of this chapter is the presentation of a template-based naturallanguage generation method for describing temporal rules of patterns innatural language.

Chapter 9 presents the process of applying the proposed semantic representa-tion in Part I in order to automatically construct the conceptual space ofthe abstracted physiological patterns. It then describes the process of in-ferring semantic descriptions for a set of unknown patterns. Furthermore,this chapter presents an evaluation to demonstrate the appropriateness ofthe conceptual space of the patterns used for text generation, togetherwith a comparison between the output text of the physiological patternfrom the semantic model and the output text of the template-based ap-proaches.

1.5.THESISOUTLINE11

PartII:Physiologicalsensordata:Fromdataanalysisto

linguisticdescriptions

ThesecondpartofthethesisfocusesontheentireframeworkofdescribingthenumericaldatausingthesemanticmodelproposedinPartI,butspecificallyforphysiologicalsensordata.Aseriesofdataanalysisapproachesareinvestigatedinordertopreparesuitableinformation(i.e.,timeseriespatterns)thatisfedtothesemanticrepresentation.Chapters6to8arelargelydedicatedtothistask.First,Thecurrentstateoftheartforthedataminingapproachesonsensordataisdiscussed(Chapter6).Then,afterexactingunseenbutinterestingtimeseriespatterns(Chapter7)andthetemporalrulesbetweenthepatterns(Chapter8),itisdemonstratedhowtheproposedsemanticmodelinPartIcanbeappliedtolinguisticallydescribethesepatternsinsuchaframework(Chapter9).

Chapter6providesasurveyofworkonexistingdataminingapproachesinordertoanalysewearablesensorsinhealthmonitoringsystems.Italsoconsidersthenewtrendsinthefieldandthecurrentchallengestodataanalysisinthehealthcaredomain.

Chapter7introducestheprocessesofcollectingandacquiringtheinputphysi-ologicalsensordatasetsthatwereusedinthiswork.Itthendescribesthevariousunsupervisedapproachesfirstusedtodetectpartialtrendsandtoextractprototypicalpatternsindifferentchannelsofphysiologicaltimeseriesdata.

Chapter8presentsanovelmodifiedtemporalruleminingapproachtodis-coveringtheco-occurrenceofprototypicalpatternsamongvarioustimeseriesdataofvitalsignsinaclinicalcondition.Thischapterfurtherde-scribesthealgorithmsforcomparingtheextractedrulesinordertoshowtheuniquenessoftherulesfordifferentclinicalconditions.Anothercen-tralaspectofthischapteristhepresentationofatemplate-basednaturallanguagegenerationmethodfordescribingtemporalrulesofpatternsinnaturallanguage.

Chapter9presentstheprocessofapplyingtheproposedsemanticrepresenta-tioninPartIinordertoautomaticallyconstructtheconceptualspaceoftheabstractedphysiologicalpatterns.Itthendescribestheprocessofin-ferringsemanticdescriptionsforasetofunknownpatterns.Furthermore,thischapterpresentsanevaluationtodemonstratetheappropriatenessoftheconceptualspaceofthepatternsusedfortextgeneration,togetherwithacomparisonbetweentheoutputtextofthephysiologicalpatternfromthesemanticmodelandtheoutputtextofthetemplate-basedap-proaches.


See Figure 1.3 for the MindMap of the thesis, which illustrates the appearanceof the research tasks in the chapters, together with the thesis’ contributions.After presenting parts I and II,

Chapter 10 concludes the thesis by presenting a summary of the contributions,a discussion of the limitations of the proposed semantic representationand application framework, and finally an overview of the possible direc-tions for the future work.

1.6 Publications

The contributions presented in this thesis have been published in the followingjournal and conference papers:

• Hadi Banaee, Erik Schaffernicht, and Amy Loutfi. Data-driven concep-tual spaces: Creating semantic representations for linguistic descriptionsof numerical data. Journal of Artificial Intelligence Research, submitted,2018.

• Hadi Banaee, Mobyen Uddin Ahmed, and Amy Loutfi. Data mining forwearable sensors in health monitoring systems: a review of recent trendsand challenges. Sensors, 13(12):17472–17500, 2013.

• Hadi Banaee and Amy Loutfi. Data-driven rule mining and representa-tion of temporal patterns in physiological sensor data. IEEE journal ofbiomedical and health informatics, 19(5):1557–1566, 2015.

• Hadi Banaee and Amy Loutfi. Using conceptual spaces to model domainknowledge in data-to-text systems. In 8th International Natural Lan-guage Generation (INLG) Conference, Philadelphia, USA, pages 11–15,Association for Computational Linguistics, 2014.

• Hadi Banaee, Mobyen Uddin Ahmed, and Amy Loutfi. Descriptive mod-elling of clinical conditions with data-driven rule mining in physiologicaldata. In 8th International Conference on Health Informatics (HEALTH-INF), Lisbon, Portugal, pages 103–113, 2015.

• Hadi Banaee, Mobyen Uddin Ahmed, and Amy Loutfi. A frameworkfor automatic text generation of trends in physiological time series data.In IEEE International Conference on Systems, Man, and Cybernetics(SMC), Manchester, UK, pages 3876–3881. IEEE, 2013.

• Hadi Banaee, Mobyen Uddin Ahmed, and Amy Loutfi. Towards NLG forphysiological data monitoring with body area networks. 14th EuropeanWorkshop on Natural Language Generation (ENLG), Sofia, Bulgaria,pages 193–197, Association for Computational Linguistics, 2013.


SeeFigure1.3fortheMindMapofthethesis,whichillustratestheappearanceoftheresearchtasksinthechapters,togetherwiththethesis’contributions.AfterpresentingpartsIandII,

Chapter10concludesthethesisbypresentingasummaryofthecontributions,adiscussionofthelimitationsoftheproposedsemanticrepresentationandapplicationframework,andfinallyanoverviewofthepossibledirec-tionsforthefuturework.

1.6Publications

Thecontributionspresentedinthisthesishavebeenpublishedinthefollowingjournalandconferencepapers:

•HadiBanaee,ErikSchaffernicht,andAmyLoutfi.Data-drivenconcep-tualspaces:Creatingsemanticrepresentationsforlinguisticdescriptionsofnumericaldata.JournalofArtificialIntelligenceResearch,submitted,2018.

•HadiBanaee,MobyenUddinAhmed,andAmyLoutfi.Dataminingforwearablesensorsinhealthmonitoringsystems:areviewofrecenttrendsandchallenges.Sensors,13(12):17472–17500,2013.

•HadiBanaeeandAmyLoutfi.Data-drivenruleminingandrepresenta-tionoftemporalpatternsinphysiologicalsensordata.IEEEjournalofbiomedicalandhealthinformatics,19(5):1557–1566,2015.

•HadiBanaeeandAmyLoutfi.Usingconceptualspacestomodeldomainknowledgeindata-to-textsystems.In8thInternationalNaturalLan-guageGeneration(INLG)Conference,Philadelphia,USA,pages11–15,AssociationforComputationalLinguistics,2014.

•HadiBanaee,MobyenUddinAhmed,andAmyLoutfi.Descriptivemod-ellingofclinicalconditionswithdata-drivenrulemininginphysiologicaldata.In8thInternationalConferenceonHealthInformatics(HEALTH-INF),Lisbon,Portugal,pages103–113,2015.

•HadiBanaee,MobyenUddinAhmed,andAmyLoutfi.Aframeworkforautomatictextgenerationoftrendsinphysiologicaltimeseriesdata.InIEEEInternationalConferenceonSystems,Man,andCybernetics(SMC),Manchester,UK,pages3876–3881.IEEE,2013.

•HadiBanaee,MobyenUddinAhmed,andAmyLoutfi.TowardsNLGforphysiologicaldatamonitoringwithbodyareanetworks.14thEuropeanWorkshoponNaturalLanguageGeneration(ENLG),Sofia,Bulgaria,pages193–197,AssociationforComputationalLinguistics,2013.


See Figure 1.3 for the MindMap of the thesis, which illustrates the appearanceof the research tasks in the chapters, together with the thesis’ contributions.After presenting parts I and II,

Chapter 10 concludes the thesis by presenting a summary of the contributions,a discussion of the limitations of the proposed semantic representationand application framework, and finally an overview of the possible direc-tions for the future work.

1.6 Publications

The contributions presented in this thesis have been published in the followingjournal and conference papers:

• Hadi Banaee, Erik Schaffernicht, and Amy Loutfi. Data-driven concep-tual spaces: Creating semantic representations for linguistic descriptionsof numerical data. Journal of Artificial Intelligence Research, submitted,2018.

• Hadi Banaee, Mobyen Uddin Ahmed, and Amy Loutfi. Data mining forwearable sensors in health monitoring systems: a review of recent trendsand challenges. Sensors, 13(12):17472–17500, 2013.

• Hadi Banaee and Amy Loutfi. Data-driven rule mining and representa-tion of temporal patterns in physiological sensor data. IEEE journal ofbiomedical and health informatics, 19(5):1557–1566, 2015.

• Hadi Banaee and Amy Loutfi. Using conceptual spaces to model domainknowledge in data-to-text systems. In 8th International Natural Lan-guage Generation (INLG) Conference, Philadelphia, USA, pages 11–15,Association for Computational Linguistics, 2014.

• Hadi Banaee, Mobyen Uddin Ahmed, and Amy Loutfi. Descriptive mod-elling of clinical conditions with data-driven rule mining in physiologicaldata. In 8th International Conference on Health Informatics (HEALTH-INF), Lisbon, Portugal, pages 103–113, 2015.

• Hadi Banaee, Mobyen Uddin Ahmed, and Amy Loutfi. A frameworkfor automatic text generation of trends in physiological time series data.In IEEE International Conference on Systems, Man, and Cybernetics(SMC), Manchester, UK, pages 3876–3881. IEEE, 2013.

• Hadi Banaee, Mobyen Uddin Ahmed, and Amy Loutfi. Towards NLG forphysiological data monitoring with body area networks. 14th EuropeanWorkshop on Natural Language Generation (ENLG), Sofia, Bulgaria,pages 193–197, Association for Computational Linguistics, 2013.


SeeFigure1.3fortheMindMapofthethesis,whichillustratestheappearanceoftheresearchtasksinthechapters,togetherwiththethesis’contributions.AfterpresentingpartsIandII,

Chapter10concludesthethesisbypresentingasummaryofthecontributions,adiscussionofthelimitationsoftheproposedsemanticrepresentationandapplicationframework,andfinallyanoverviewofthepossibledirec-tionsforthefuturework.

1.6Publications

Thecontributionspresentedinthisthesishavebeenpublishedinthefollowingjournalandconferencepapers:

•HadiBanaee,ErikSchaffernicht,andAmyLoutfi.Data-drivenconcep-tualspaces:Creatingsemanticrepresentationsforlinguisticdescriptionsofnumericaldata.JournalofArtificialIntelligenceResearch,submitted,2018.

•HadiBanaee,MobyenUddinAhmed,andAmyLoutfi.Dataminingforwearablesensorsinhealthmonitoringsystems:areviewofrecenttrendsandchallenges.Sensors,13(12):17472–17500,2013.

•HadiBanaeeandAmyLoutfi.Data-drivenruleminingandrepresenta-tionoftemporalpatternsinphysiologicalsensordata.IEEEjournalofbiomedicalandhealthinformatics,19(5):1557–1566,2015.

•HadiBanaeeandAmyLoutfi.Usingconceptualspacestomodeldomainknowledgeindata-to-textsystems.In8thInternationalNaturalLan-guageGeneration(INLG)Conference,Philadelphia,USA,pages11–15,AssociationforComputationalLinguistics,2014.

•HadiBanaee,MobyenUddinAhmed,andAmyLoutfi.Descriptivemod-ellingofclinicalconditionswithdata-drivenrulemininginphysiologicaldata.In8thInternationalConferenceonHealthInformatics(HEALTH-INF),Lisbon,Portugal,pages103–113,2015.

•HadiBanaee,MobyenUddinAhmed,andAmyLoutfi.Aframeworkforautomatictextgenerationoftrendsinphysiologicaltimeseriesdata.InIEEEInternationalConferenceonSystems,Man,andCybernetics(SMC),Manchester,UK,pages3876–3881.IEEE,2013.

•HadiBanaee,MobyenUddinAhmed,andAmyLoutfi.TowardsNLGforphysiologicaldatamonitoringwithbodyareanetworks.14thEuropeanWorkshoponNaturalLanguageGeneration(ENLG),Sofia,Bulgaria,pages193–197,AssociationforComputationalLinguistics,2013.

1.6. PUBLICATIONS 13

The following publications appeared as additional results of this research, butare not included in this thesis:

• Mobyen Uddin Ahmed, Hadi Banaee, and Amy Loutfi. Health moni-toring for elderly: An application using case-based reasoning and clusteranalysis. ISRN Artificial Intelligence, vol. 2013, 2013.

• Hadi Banaee and Amy Loutfi. What I talk about when I talk about data:Descriptive modelling of data in data-to-text systems. In 1st InternationalWorkshop on Data-to-Text Generation, Edinburgh, UK, 2015.

• Mobyen Uddin Ahmed, Jesica Rivero Espinosa, Alenka Reissner, ÀlexDomingo, Hadi Banaee, Amy Loutfi, and Xavier Rafael-Palou. Self-serveict-based health monitoring to support active ageing. In 8th Interna-tional Conference on Health Informatics (HEALTHINF), Lisbon, Portu-gal, 2015.

• Mobyen Uddin Ahmed, Hadi Banaee, Xavier Rafael-Palou, and AmyLoutfi. Intelligent healthcare services to support health monitoring ofelderly. In Internet of things. user-centric IoT, pp. 178-186. Springer,Cham, 2015.

1.6.PUBLICATIONS13

Thefollowingpublicationsappearedasadditionalresultsofthisresearch,butarenotincludedinthisthesis:

•MobyenUddinAhmed,HadiBanaee,andAmyLoutfi.Healthmoni-toringforelderly:Anapplicationusingcase-basedreasoningandclusteranalysis.ISRNArtificialIntelligence,vol.2013,2013.

•HadiBanaeeandAmyLoutfi.WhatItalkaboutwhenItalkaboutdata:Descriptivemodellingofdataindata-to-textsystems.In1stInternationalWorkshoponData-to-TextGeneration,Edinburgh,UK,2015.

•MobyenUddinAhmed,JesicaRiveroEspinosa,AlenkaReissner,ÀlexDomingo,HadiBanaee,AmyLoutfi,andXavierRafael-Palou.Self-serveict-basedhealthmonitoringtosupportactiveageing.In8thInterna-tionalConferenceonHealthInformatics(HEALTHINF),Lisbon,Portu-gal,2015.

•MobyenUddinAhmed,HadiBanaee,XavierRafael-Palou,andAmyLoutfi.Intelligenthealthcareservicestosupporthealthmonitoringofelderly.InInternetofthings.user-centricIoT,pp.178-186.Springer,Cham,2015.

1.6. PUBLICATIONS 13

The following publications appeared as additional results of this research, butare not included in this thesis:

• Mobyen Uddin Ahmed, Hadi Banaee, and Amy Loutfi. Health moni-toring for elderly: An application using case-based reasoning and clusteranalysis. ISRN Artificial Intelligence, vol. 2013, 2013.

• Hadi Banaee and Amy Loutfi. What I talk about when I talk about data:Descriptive modelling of data in data-to-text systems. In 1st InternationalWorkshop on Data-to-Text Generation, Edinburgh, UK, 2015.

• Mobyen Uddin Ahmed, Jesica Rivero Espinosa, Alenka Reissner, ÀlexDomingo, Hadi Banaee, Amy Loutfi, and Xavier Rafael-Palou. Self-serveict-based health monitoring to support active ageing. In 8th Interna-tional Conference on Health Informatics (HEALTHINF), Lisbon, Portu-gal, 2015.

• Mobyen Uddin Ahmed, Hadi Banaee, Xavier Rafael-Palou, and AmyLoutfi. Intelligent healthcare services to support health monitoring ofelderly. In Internet of things. user-centric IoT, pp. 178-186. Springer,Cham, 2015.

1.6.PUBLICATIONS13

Thefollowingpublicationsappearedasadditionalresultsofthisresearch,butarenotincludedinthisthesis:

•MobyenUddinAhmed,HadiBanaee,andAmyLoutfi.Healthmoni-toringforelderly:Anapplicationusingcase-basedreasoningandclusteranalysis.ISRNArtificialIntelligence,vol.2013,2013.

•HadiBanaeeandAmyLoutfi.WhatItalkaboutwhenItalkaboutdata:Descriptivemodellingofdataindata-to-textsystems.In1stInternationalWorkshoponData-to-TextGeneration,Edinburgh,UK,2015.

•MobyenUddinAhmed,JesicaRiveroEspinosa,AlenkaReissner,ÀlexDomingo,HadiBanaee,AmyLoutfi,andXavierRafael-Palou.Self-serveict-basedhealthmonitoringtosupportactiveageing.In8thInterna-tionalConferenceonHealthInformatics(HEALTHINF),Lisbon,Portu-gal,2015.

•MobyenUddinAhmed,HadiBanaee,XavierRafael-Palou,andAmyLoutfi.Intelligenthealthcareservicestosupporthealthmonitoringofelderly.InInternetofthings.user-centricIoT,pp.178-186.Springer,Cham,2015.


Thesis MindMap

Figure 1.3: Thesis MindMap, illustrating the appearance of the research tasksin the chapters, together with the contributions of this thesis.


ThesisMindMap

Figure1.3:ThesisMindMap,illustratingtheappearanceoftheresearchtasksinthechapters,togetherwiththecontributionsofthisthesis.


Thesis MindMap

Figure 1.3: Thesis MindMap, illustrating the appearance of the research tasksin the chapters, together with the contributions of this thesis.


ThesisMindMap

Figure1.3:ThesisMindMap,illustratingtheappearanceoftheresearchtasksinthechapters,togetherwiththecontributionsofthisthesis.

Part I

Creating SemanticRepresentations for Numerical

Data

Part I of this thesis presents the notion of data-driven conceptual spaces as asemantic representation tool for linguistically describing numerical data. Afteran overview of the background and the related work (Chapter 2), this part ex-plains how a conceptual space can be constructed using observed information(Chapter 3), before proceeding to explain how it can be utilised to infer se-mantics for unseen observations (Chapter 4). This process is then exemplifiedby applying the approach to a data set of leaf examples, and performing anempirical evaluation on the goodness of the derived descriptions for unknownsamples (Chapter 5).

17

PartI

CreatingSemanticRepresentationsforNumerical

Data

PartIofthisthesispresentsthenotionofdata-drivenconceptualspacesasasemanticrepresentationtoolforlinguisticallydescribingnumericaldata.Afteranoverviewofthebackgroundandtherelatedwork(Chapter2),thispartex-plainshowaconceptualspacecanbeconstructedusingobservedinformation(Chapter3),beforeproceedingtoexplainhowitcanbeutilisedtoinferse-manticsforunseenobservations(Chapter4).Thisprocessisthenexemplifiedbyapplyingtheapproachtoadatasetofleafexamples,andperforminganempiricalevaluationonthegoodnessofthederiveddescriptionsforunknownsamples(Chapter5).

17

Part I

Creating SemanticRepresentations for Numerical

Data

Part I of this thesis presents the notion of data-driven conceptual spaces as asemantic representation tool for linguistically describing numerical data. Afteran overview of the background and the related work (Chapter 2), this part ex-plains how a conceptual space can be constructed using observed information(Chapter 3), before proceeding to explain how it can be utilised to infer se-mantics for unseen observations (Chapter 4). This process is then exemplifiedby applying the approach to a data set of leaf examples, and performing anempirical evaluation on the goodness of the derived descriptions for unknownsamples (Chapter 5).

17

PartI

CreatingSemanticRepresentationsforNumerical

Data

PartIofthisthesispresentsthenotionofdata-drivenconceptualspacesasasemanticrepresentationtoolforlinguisticallydescribingnumericaldata.Afteranoverviewofthebackgroundandtherelatedwork(Chapter2),thispartex-plainshowaconceptualspacecanbeconstructedusingobservedinformation(Chapter3),beforeproceedingtoexplainhowitcanbeutilisedtoinferse-manticsforunseenobservations(Chapter4).Thisprocessisthenexemplifiedbyapplyingtheapproachtoadatasetofleafexamples,andperforminganempiricalevaluationonthegoodnessofthederiveddescriptionsforunknownsamples(Chapter5).

17

Chapter 2

Background and Related Work

“The world is my world: this is manifest in the fact thatthe limits of language (of that language which alone Iunderstand) mean the limits of my world.”

— Ludwig Wittgenstein (1889–1951)

In this chapter, the background and related work for this thesis are presented.This chapter focusses in particular on two fields: concept formation and

concept acquisition using conceptual spaces, and approaches for generating lin-guistic descriptions from data. The chapter begins with a discussion about theterm of semantic representation and the way that it has been used throughoutthis thesis. The theory of conceptual spaces then is introduced, together with adiscussion about its use in artificial intelligence research. The background andthe related work of linguistic descriptions of data (LDD) and natural languagegeneration (NLG) are presented, together with the role of knowledge acquisi-tion (KA) within the field of NLG. Moreover, a brief discussion summarises theadvantage of using conceptual spaces as a semantic representation that can belater used for to perform NLG tasks.

2.1 Semantic Representation

The notion of a semantic representation has been used in a variety of ways indifferent areas such as knowledge representation in AI, cognitive science, andphilosophy of language. Two prominent traditions for semantic representationsexist [108]. One is to study the semantics of words by representing the relationsof the words in natural language. For such representations, also called amodalapproaches [78], input is linguistic information. Some computational modelssuch as semantic space and semantic networks are examples of this kind ofsemantic representations in the field of linguistics [78, 108]. Another tradition

19

Chapter2

BackgroundandRelatedWork

“Theworldismyworld:thisismanifestinthefactthatthelimitsoflanguage(ofthatlanguagewhichaloneIunderstand)meanthelimitsofmyworld.”

—LudwigWittgenstein(1889–1951)

Inthischapter,thebackgroundandrelatedworkforthisthesisarepresented.Thischapterfocussesinparticularontwofields:conceptformationand

conceptacquisitionusingconceptualspaces,andapproachesforgeneratinglin-guisticdescriptionsfromdata.Thechapterbeginswithadiscussionaboutthetermofsemanticrepresentationandthewaythatithasbeenusedthroughoutthisthesis.Thetheoryofconceptualspacesthenisintroduced,togetherwithadiscussionaboutitsuseinartificialintelligenceresearch.Thebackgroundandtherelatedworkoflinguisticdescriptionsofdata(LDD)andnaturallanguagegeneration(NLG)arepresented,togetherwiththeroleofknowledgeacquisi-tion(KA)withinthefieldofNLG.Moreover,abriefdiscussionsummarisestheadvantageofusingconceptualspacesasasemanticrepresentationthatcanbelaterusedfortoperformNLGtasks.

2.1SemanticRepresentation

ThenotionofasemanticrepresentationhasbeenusedinavarietyofwaysindifferentareassuchasknowledgerepresentationinAI,cognitivescience,andphilosophyoflanguage.Twoprominenttraditionsforsemanticrepresentationsexist[108].Oneistostudythesemanticsofwordsbyrepresentingtherelationsofthewordsinnaturallanguage.Forsuchrepresentations,alsocalledamodalapproaches[78],inputislinguisticinformation.Somecomputationalmodelssuchassemanticspaceandsemanticnetworksareexamplesofthiskindofsemanticrepresentationsinthefieldoflinguistics[78,108].Anothertradition

19

Chapter 2

Background and Related Work

“The world is my world: this is manifest in the fact thatthe limits of language (of that language which alone Iunderstand) mean the limits of my world.”

— Ludwig Wittgenstein (1889–1951)

In this chapter, the background and related work for this thesis are presented.This chapter focusses in particular on two fields: concept formation and

concept acquisition using conceptual spaces, and approaches for generating lin-guistic descriptions from data. The chapter begins with a discussion about theterm of semantic representation and the way that it has been used throughoutthis thesis. The theory of conceptual spaces then is introduced, together with adiscussion about its use in artificial intelligence research. The background andthe related work of linguistic descriptions of data (LDD) and natural languagegeneration (NLG) are presented, together with the role of knowledge acquisi-tion (KA) within the field of NLG. Moreover, a brief discussion summarises theadvantage of using conceptual spaces as a semantic representation that can belater used for to perform NLG tasks.

2.1 Semantic Representation

The notion of a semantic representation has been used in a variety of ways indifferent areas such as knowledge representation in AI, cognitive science, andphilosophy of language. Two prominent traditions for semantic representationsexist [108]. One is to study the semantics of words by representing the relationsof the words in natural language. For such representations, also called amodalapproaches [78], input is linguistic information. Some computational modelssuch as semantic space and semantic networks are examples of this kind ofsemantic representations in the field of linguistics [78, 108]. Another tradition

19

Chapter2

BackgroundandRelatedWork

“Theworldismyworld:thisismanifestinthefactthatthelimitsoflanguage(ofthatlanguagewhichaloneIunderstand)meanthelimitsofmyworld.”

—LudwigWittgenstein(1889–1951)

Inthischapter,thebackgroundandrelatedworkforthisthesisarepresented.Thischapterfocussesinparticularontwofields:conceptformationand

conceptacquisitionusingconceptualspaces,andapproachesforgeneratinglin-guisticdescriptionsfromdata.Thechapterbeginswithadiscussionaboutthetermofsemanticrepresentationandthewaythatithasbeenusedthroughoutthisthesis.Thetheoryofconceptualspacesthenisintroduced,togetherwithadiscussionaboutitsuseinartificialintelligenceresearch.Thebackgroundandtherelatedworkoflinguisticdescriptionsofdata(LDD)andnaturallanguagegeneration(NLG)arepresented,togetherwiththeroleofknowledgeacquisi-tion(KA)withinthefieldofNLG.Moreover,abriefdiscussionsummarisestheadvantageofusingconceptualspacesasasemanticrepresentationthatcanbelaterusedfortoperformNLGtasks.

2.1SemanticRepresentation

ThenotionofasemanticrepresentationhasbeenusedinavarietyofwaysindifferentareassuchasknowledgerepresentationinAI,cognitivescience,andphilosophyoflanguage.Twoprominenttraditionsforsemanticrepresentationsexist[108].Oneistostudythesemanticsofwordsbyrepresentingtherelationsofthewordsinnaturallanguage.Forsuchrepresentations,alsocalledamodalapproaches[78],inputislinguisticinformation.Somecomputationalmodelssuchassemanticspaceandsemanticnetworksareexamplesofthiskindofsemanticrepresentationsinthefieldoflinguistics[78,108].Anothertradition

19

20 CHAPTER 2. BACKGROUND & RELATED WORK

focuses on conceptual structures for the representation of meanings, which con-siders the relations between concepts and percepts or actions to model the se-mantics. In this case of semantic representations, also called experiential [225],the input is a set of perceptual information such as sensory data, memory, etc.The origin of this kind of semantic representation is the study of cognitive se-mantics, wherein the focus is on the meaning of the concepts as a cognitivephenomenon [19]. Cognitive semantics considers the meaning of linguistic ex-pressions as mental entities coming from our perceptions. This perceptual in-formation later is formed as concepts in our mind. This point of view is againstthe realist approaches that define semantics as something out in the world [86].From the realist point of view, semantics can be represented using e.g., ab-stract propositions and description logic, and can be modelled and verified bytruth conditions. This definition is highly related to the way that semantics hasbeen used and modelled in the field of knowledge representation in AI [164].The realist approach, also called truth-conditional semantics, seeks to connectlanguage with statements about the real-world in the form of meta-languagestatements (e.g., in propositional or predicate calculus) [68, 164]. Within Cog-nitive semantics, however, meaning is a conceptual structure that comes beforethe truth [86]. Gärdenfors in his recent book, The geometry of meaning [90],includes another principle to the cognitive semantics as: ‘a semantic theory shallaccount for the relation between perceptual processes and meaning’1. In otherwords, semantic representations, from the cognitive point of view, should be aconceptual structure which represents both perceptual and linguistic informa-tion.

The notion of a semantic representation in this thesis follows the latter men-tioned definition, by first constructing a conceptual representation using percep-tual information (numerical sensor data) and linguistic information (symbolicannotations), and then inferring semantically enriched descriptions. Therefore,in this thesis, a semantic representation of knowledge, as a core issue in thefield of cognitive science, provides a conceptual structure for the meaning ofperceived concepts [108]. This kind of representation eases the task of seman-tic reasoning of the perceived information, i.e., inferring the meaning of theconcepts or words in language [14].

2.2 On the Theory of Conceptual Spaces

One approach to create semantic representations from perception is to use thetheory of conceptual spaces introduced by Gärdenfors [86] as a knowledgerepresentation framework at the conceptual level which relies on the paradigmof cognitive semantics.

1Azzouni likewise discusses this principle in his book, Semantic Perception [25], where thecontents of human perception sometimes involve semantic properties (e.g., meaningfulness). Thus,he argued that meaning is perceived, not inferred.

20CHAPTER2.BACKGROUND&RELATEDWORK

focusesonconceptualstructuresfortherepresentationofmeanings,whichcon-siderstherelationsbetweenconceptsandperceptsoractionstomodelthese-mantics.Inthiscaseofsemanticrepresentations,alsocalledexperiential[225],theinputisasetofperceptualinformationsuchassensorydata,memory,etc.Theoriginofthiskindofsemanticrepresentationisthestudyofcognitivese-mantics,whereinthefocusisonthemeaningoftheconceptsasacognitivephenomenon[19].Cognitivesemanticsconsidersthemeaningoflinguisticex-pressionsasmentalentitiescomingfromourperceptions.Thisperceptualin-formationlaterisformedasconceptsinourmind.Thispointofviewisagainsttherealistapproachesthatdefinesemanticsassomethingoutintheworld[86].Fromtherealistpointofview,semanticscanberepresentedusinge.g.,ab-stractpropositionsanddescriptionlogic,andcanbemodelledandverifiedbytruthconditions.ThisdefinitionishighlyrelatedtothewaythatsemanticshasbeenusedandmodelledinthefieldofknowledgerepresentationinAI[164].Therealistapproach,alsocalledtruth-conditionalsemantics,seekstoconnectlanguagewithstatementsaboutthereal-worldintheformofmeta-languagestatements(e.g.,inpropositionalorpredicatecalculus)[68,164].WithinCog-nitivesemantics,however,meaningisaconceptualstructurethatcomesbeforethetruth[86].Gärdenforsinhisrecentbook,Thegeometryofmeaning[90],includesanotherprincipletothecognitivesemanticsas:‘asemantictheoryshallaccountfortherelationbetweenperceptualprocessesandmeaning’1.Inotherwords,semanticrepresentations,fromthecognitivepointofview,shouldbeaconceptualstructurewhichrepresentsbothperceptualandlinguisticinforma-tion.

Thenotionofasemanticrepresentationinthisthesisfollowsthelattermen-tioneddefinition,byfirstconstructingaconceptualrepresentationusingpercep-tualinformation(numericalsensordata)andlinguisticinformation(symbolicannotations),andtheninferringsemanticallyenricheddescriptions.Therefore,inthisthesis,asemanticrepresentationofknowledge,asacoreissueinthefieldofcognitivescience,providesaconceptualstructureforthemeaningofperceivedconcepts[108].Thiskindofrepresentationeasesthetaskofseman-ticreasoningoftheperceivedinformation,i.e.,inferringthemeaningoftheconceptsorwordsinlanguage[14].

2.2OntheTheoryofConceptualSpaces

OneapproachtocreatesemanticrepresentationsfromperceptionistousethetheoryofconceptualspacesintroducedbyGärdenfors[86]asaknowledgerepresentationframeworkattheconceptuallevelwhichreliesontheparadigmofcognitivesemantics.

1Azzounilikewisediscussesthisprincipleinhisbook,SemanticPerception[25],wherethecontentsofhumanperceptionsometimesinvolvesemanticproperties(e.g.,meaningfulness).Thus,hearguedthatmeaningisperceived,notinferred.


focuses on conceptual structures for the representation of meanings, which con-siders the relations between concepts and percepts or actions to model the se-mantics. In this case of semantic representations, also called experiential [225],the input is a set of perceptual information such as sensory data, memory, etc.The origin of this kind of semantic representation is the study of cognitive se-mantics, wherein the focus is on the meaning of the concepts as a cognitivephenomenon [19]. Cognitive semantics considers the meaning of linguistic ex-pressions as mental entities coming from our perceptions. This perceptual in-formation later is formed as concepts in our mind. This point of view is againstthe realist approaches that define semantics as something out in the world [86].From the realist point of view, semantics can be represented using e.g., ab-stract propositions and description logic, and can be modelled and verified bytruth conditions. This definition is highly related to the way that semantics hasbeen used and modelled in the field of knowledge representation in AI [164].The realist approach, also called truth-conditional semantics, seeks to connectlanguage with statements about the real-world in the form of meta-languagestatements (e.g., in propositional or predicate calculus) [68, 164]. Within Cog-nitive semantics, however, meaning is a conceptual structure that comes beforethe truth [86]. Gärdenfors in his recent book, The geometry of meaning [90],includes another principle to the cognitive semantics as: ‘a semantic theory shallaccount for the relation between perceptual processes and meaning’1. In otherwords, semantic representations, from the cognitive point of view, should be aconceptual structure which represents both perceptual and linguistic informa-tion.

The notion of a semantic representation in this thesis follows the latter men-tioned definition, by first constructing a conceptual representation using percep-tual information (numerical sensor data) and linguistic information (symbolicannotations), and then inferring semantically enriched descriptions. Therefore,in this thesis, a semantic representation of knowledge, as a core issue in thefield of cognitive science, provides a conceptual structure for the meaning ofperceived concepts [108]. This kind of representation eases the task of seman-tic reasoning of the perceived information, i.e., inferring the meaning of theconcepts or words in language [14].

2.2 On the Theory of Conceptual Spaces

One approach to create semantic representations from perception is to use thetheory of conceptual spaces introduced by Gärdenfors [86] as a knowledgerepresentation framework at the conceptual level which relies on the paradigmof cognitive semantics.

1Azzouni likewise discusses this principle in his book, Semantic Perception [25], where thecontents of human perception sometimes involve semantic properties (e.g., meaningfulness). Thus,he argued that meaning is perceived, not inferred.


focusesonconceptualstructuresfortherepresentationofmeanings,whichcon-siderstherelationsbetweenconceptsandperceptsoractionstomodelthese-mantics.Inthiscaseofsemanticrepresentations,alsocalledexperiential[225],theinputisasetofperceptualinformationsuchassensorydata,memory,etc.Theoriginofthiskindofsemanticrepresentationisthestudyofcognitivese-mantics,whereinthefocusisonthemeaningoftheconceptsasacognitivephenomenon[19].Cognitivesemanticsconsidersthemeaningoflinguisticex-pressionsasmentalentitiescomingfromourperceptions.Thisperceptualin-formationlaterisformedasconceptsinourmind.Thispointofviewisagainsttherealistapproachesthatdefinesemanticsassomethingoutintheworld[86].Fromtherealistpointofview,semanticscanberepresentedusinge.g.,ab-stractpropositionsanddescriptionlogic,andcanbemodelledandverifiedbytruthconditions.ThisdefinitionishighlyrelatedtothewaythatsemanticshasbeenusedandmodelledinthefieldofknowledgerepresentationinAI[164].Therealistapproach,alsocalledtruth-conditionalsemantics,seekstoconnectlanguagewithstatementsaboutthereal-worldintheformofmeta-languagestatements(e.g.,inpropositionalorpredicatecalculus)[68,164].WithinCog-nitivesemantics,however,meaningisaconceptualstructurethatcomesbeforethetruth[86].Gärdenforsinhisrecentbook,Thegeometryofmeaning[90],includesanotherprincipletothecognitivesemanticsas:‘asemantictheoryshallaccountfortherelationbetweenperceptualprocessesandmeaning’1.Inotherwords,semanticrepresentations,fromthecognitivepointofview,shouldbeaconceptualstructurewhichrepresentsbothperceptualandlinguisticinforma-tion.

Thenotionofasemanticrepresentationinthisthesisfollowsthelattermen-tioneddefinition,byfirstconstructingaconceptualrepresentationusingpercep-tualinformation(numericalsensordata)andlinguisticinformation(symbolicannotations),andtheninferringsemanticallyenricheddescriptions.Therefore,inthisthesis,asemanticrepresentationofknowledge,asacoreissueinthefieldofcognitivescience,providesaconceptualstructureforthemeaningofperceivedconcepts[108].Thiskindofrepresentationeasesthetaskofseman-ticreasoningoftheperceivedinformation,i.e.,inferringthemeaningoftheconceptsorwordsinlanguage[14].

2.2OntheTheoryofConceptualSpaces

OneapproachtocreatesemanticrepresentationsfromperceptionistousethetheoryofconceptualspacesintroducedbyGärdenfors[86]asaknowledgerepresentationframeworkattheconceptuallevelwhichreliesontheparadigmofcognitivesemantics.

1Azzounilikewisediscussesthisprincipleinhisbook,SemanticPerception[25],wherethecontentsofhumanperceptionsometimesinvolvesemanticproperties(e.g.,meaningfulness).Thus,hearguedthatmeaningisperceived,notinferred.

2.2. ON THE THEORY OF CONCEPTUAL SPACES 21

In cognitive science, both explanatory goals and constructive goals are con-sidered. Explanatory goals aim to formulate the theories of cognition by study-ing the cognitive behaviour of humans or animals. Constructive goals aim toconstruct systems to fulfil cognitive tasks by building artefacts such as chess-playing programs and robots [87]. Both types of goals are dealing with theproblem of representations in cognitive science to model how humans under-stand concepts.

From the AI point of view, two prominent approaches have been studied forthe problem of modelling representations. Symbolic approaches are top-downrepresentations that aim to model the high-level concepts using symbol manip-ulation. Abstract concepts are labelled by symbols, and the relations betweenthe concepts are defined in a rule-based manner. Thus, inferences are logicaland often are the result of first-order operations between the symbols and con-cepts [9, 116]. Associationism approaches aim to model cognition in a similarway that the neural structure of the brain associates the properties into an as-sumed concept. Connectionism as a particular case of this approach attemptsto model the brain. This approach represents the interactions between the sim-ple units as artificial neural networks in order to model or generate complexbehaviours.

In both symbolic and associationism approaches, the main drawback is thelack of modelling various tasks of cognition such as concept learning, seman-tic similarity, and concept combination, simultaneously [9]. In the literature,conceptual spaces have been introduced as a mid-level representation model incognitive systems between the high-level symbolic representation and low-levelassociationistic representations [14]. The aim of representing knowledge in aconceptual space is to develop an intuitive interpretation of the relationshipbetween symbolic and sub-symbolic information [14,87]. This theory exploreshow various types of information can be conceptually represented, both froman explanatory perspective and for constructing an artificial system [87]. Sucha conceptual representation is the most important cognitive function, that, ac-cording to Hampton [111], “stands at the centre of the information processingflow, with input from perceptual modules of differing kinds, and is centrallyinvolved in memory, planning, decision-making, inductive inferences and muchmore besides”. The ability to perceived information on a conceptual level re-lates the theory of conceptual spaces to the considering semantic representa-tions.

Conceptual spaces are the geometric representations of how humans per-ceive, understand and learn concepts. Mainly, conceptual spaces are defined asgeometric or topological structures that model, categorise and represent con-cepts in a set of multi-dimensional domains [87, 116]. The following is the de-scription of the elements of a conceptual space as proposed by Gärdenfors [86].The formal reformulation of these elements is presented later in Chapter 3.

2.2.ONTHETHEORYOFCONCEPTUALSPACES21

Incognitivescience,bothexplanatorygoalsandconstructivegoalsarecon-sidered.Explanatorygoalsaimtoformulatethetheoriesofcognitionbystudy-ingthecognitivebehaviourofhumansoranimals.Constructivegoalsaimtoconstructsystemstofulfilcognitivetasksbybuildingartefactssuchaschess-playingprogramsandrobots[87].Bothtypesofgoalsaredealingwiththeproblemofrepresentationsincognitivesciencetomodelhowhumansunder-standconcepts.

FromtheAIpointofview,twoprominentapproacheshavebeenstudiedfortheproblemofmodellingrepresentations.Symbolicapproachesaretop-downrepresentationsthataimtomodelthehigh-levelconceptsusingsymbolmanip-ulation.Abstractconceptsarelabelledbysymbols,andtherelationsbetweentheconceptsaredefinedinarule-basedmanner.Thus,inferencesarelogicalandoftenaretheresultoffirst-orderoperationsbetweenthesymbolsandcon-cepts[9,116].Associationismapproachesaimtomodelcognitioninasimilarwaythattheneuralstructureofthebrainassociatesthepropertiesintoanas-sumedconcept.Connectionismasaparticularcaseofthisapproachattemptstomodelthebrain.Thisapproachrepresentstheinteractionsbetweenthesim-pleunitsasartificialneuralnetworksinordertomodelorgeneratecomplexbehaviours.

Inbothsymbolicandassociationismapproaches,themaindrawbackisthelackofmodellingvarioustasksofcognitionsuchasconceptlearning,seman-ticsimilarity,andconceptcombination,simultaneously[9].Intheliterature,conceptualspaceshavebeenintroducedasamid-levelrepresentationmodelincognitivesystemsbetweenthehigh-levelsymbolicrepresentationandlow-levelassociationisticrepresentations[14].Theaimofrepresentingknowledgeinaconceptualspaceistodevelopanintuitiveinterpretationoftherelationshipbetweensymbolicandsub-symbolicinformation[14,87].Thistheoryexploreshowvarioustypesofinformationcanbeconceptuallyrepresented,bothfromanexplanatoryperspectiveandforconstructinganartificialsystem[87].Suchaconceptualrepresentationisthemostimportantcognitivefunction,that,ac-cordingtoHampton[111],“standsatthecentreoftheinformationprocessingflow,withinputfromperceptualmodulesofdifferingkinds,andiscentrallyinvolvedinmemory,planning,decision-making,inductiveinferencesandmuchmorebesides”.Theabilitytoperceivedinformationonaconceptuallevelre-latesthetheoryofconceptualspacestotheconsideringsemanticrepresenta-tions.

Conceptualspacesarethegeometricrepresentationsofhowhumansper-ceive,understandandlearnconcepts.Mainly,conceptualspacesaredefinedasgeometricortopologicalstructuresthatmodel,categoriseandrepresentcon-ceptsinasetofmulti-dimensionaldomains[87,116].Thefollowingisthede-scriptionoftheelementsofaconceptualspaceasproposedbyGärdenfors[86].TheformalreformulationoftheseelementsispresentedlaterinChapter3.


In cognitive science, both explanatory goals and constructive goals are con-sidered. Explanatory goals aim to formulate the theories of cognition by study-ing the cognitive behaviour of humans or animals. Constructive goals aim toconstruct systems to fulfil cognitive tasks by building artefacts such as chess-playing programs and robots [87]. Both types of goals are dealing with theproblem of representations in cognitive science to model how humans under-stand concepts.

From the AI point of view, two prominent approaches have been studied forthe problem of modelling representations. Symbolic approaches are top-downrepresentations that aim to model the high-level concepts using symbol manip-ulation. Abstract concepts are labelled by symbols, and the relations betweenthe concepts are defined in a rule-based manner. Thus, inferences are logicaland often are the result of first-order operations between the symbols and con-cepts [9, 116]. Associationism approaches aim to model cognition in a similarway that the neural structure of the brain associates the properties into an as-sumed concept. Connectionism as a particular case of this approach attemptsto model the brain. This approach represents the interactions between the sim-ple units as artificial neural networks in order to model or generate complexbehaviours.

In both symbolic and associationism approaches, the main drawback is thelack of modelling various tasks of cognition such as concept learning, seman-tic similarity, and concept combination, simultaneously [9]. In the literature,conceptual spaces have been introduced as a mid-level representation model incognitive systems between the high-level symbolic representation and low-levelassociationistic representations [14]. The aim of representing knowledge in aconceptual space is to develop an intuitive interpretation of the relationshipbetween symbolic and sub-symbolic information [14,87]. This theory exploreshow various types of information can be conceptually represented, both froman explanatory perspective and for constructing an artificial system [87]. Sucha conceptual representation is the most important cognitive function, that, ac-cording to Hampton [111], “stands at the centre of the information processingflow, with input from perceptual modules of differing kinds, and is centrallyinvolved in memory, planning, decision-making, inductive inferences and muchmore besides”. The ability to perceived information on a conceptual level re-lates the theory of conceptual spaces to the considering semantic representa-tions.

Conceptual spaces are the geometric representations of how humans per-ceive, understand and learn concepts. Mainly, conceptual spaces are defined asgeometric or topological structures that model, categorise and represent con-cepts in a set of multi-dimensional domains [87, 116]. The following is the de-scription of the elements of a conceptual space as proposed by Gärdenfors [86].The formal reformulation of these elements is presented later in Chapter 3.


Incognitivescience,bothexplanatorygoalsandconstructivegoalsarecon-sidered.Explanatorygoalsaimtoformulatethetheoriesofcognitionbystudy-ingthecognitivebehaviourofhumansoranimals.Constructivegoalsaimtoconstructsystemstofulfilcognitivetasksbybuildingartefactssuchaschess-playingprogramsandrobots[87].Bothtypesofgoalsaredealingwiththeproblemofrepresentationsincognitivesciencetomodelhowhumansunder-standconcepts.

FromtheAIpointofview,twoprominentapproacheshavebeenstudiedfortheproblemofmodellingrepresentations.Symbolicapproachesaretop-downrepresentationsthataimtomodelthehigh-levelconceptsusingsymbolmanip-ulation.Abstractconceptsarelabelledbysymbols,andtherelationsbetweentheconceptsaredefinedinarule-basedmanner.Thus,inferencesarelogicalandoftenaretheresultoffirst-orderoperationsbetweenthesymbolsandcon-cepts[9,116].Associationismapproachesaimtomodelcognitioninasimilarwaythattheneuralstructureofthebrainassociatesthepropertiesintoanas-sumedconcept.Connectionismasaparticularcaseofthisapproachattemptstomodelthebrain.Thisapproachrepresentstheinteractionsbetweenthesim-pleunitsasartificialneuralnetworksinordertomodelorgeneratecomplexbehaviours.

Inbothsymbolicandassociationismapproaches,themaindrawbackisthelackofmodellingvarioustasksofcognitionsuchasconceptlearning,seman-ticsimilarity,andconceptcombination,simultaneously[9].Intheliterature,conceptualspaceshavebeenintroducedasamid-levelrepresentationmodelincognitivesystemsbetweenthehigh-levelsymbolicrepresentationandlow-levelassociationisticrepresentations[14].Theaimofrepresentingknowledgeinaconceptualspaceistodevelopanintuitiveinterpretationoftherelationshipbetweensymbolicandsub-symbolicinformation[14,87].Thistheoryexploreshowvarioustypesofinformationcanbeconceptuallyrepresented,bothfromanexplanatoryperspectiveandforconstructinganartificialsystem[87].Suchaconceptualrepresentationisthemostimportantcognitivefunction,that,ac-cordingtoHampton[111],“standsatthecentreoftheinformationprocessingflow,withinputfromperceptualmodulesofdifferingkinds,andiscentrallyinvolvedinmemory,planning,decision-making,inductiveinferencesandmuchmorebesides”.Theabilitytoperceivedinformationonaconceptuallevelre-latesthetheoryofconceptualspacestotheconsideringsemanticrepresenta-tions.

Conceptualspacesarethegeometricrepresentationsofhowhumansper-ceive,understandandlearnconcepts.Mainly,conceptualspacesaredefinedasgeometricortopologicalstructuresthatmodel,categoriseandrepresentcon-ceptsinasetofmulti-dimensionaldomains[87,116].Thefollowingisthede-scriptionoftheelementsofaconceptualspaceasproposedbyGärdenfors[86].TheformalreformulationoftheseelementsispresentedlaterinChapter3.


• Quality Dimensions: A conceptual space consists of a set of quality di-mensions (i.e., cognitively meaningful attributes). The quality dimensionspresent the quality attributes of objects in a metric space based on theirmeasured quality values. Some examples of quality dimensions can benotions like height, width, and depth. Quality dimensions can be eitherinterpreted as phenomenal (psychological) or scientific (theoretical) di-mensions. The psychological interpretation of quality dimensions repre-sents the phenomenal human responses in a semantically meaningful way,which are coming from human perceptions. Colour perception is a phe-nomenal example that can be described by three quality dimensions: hue,saturation, and brightness. The scientific interpretation is defined basedon scientific theories which measure the values associated with e.g., sen-sors that measure wavelengths [86]. This distinction is tightly related tothe mentioned goals of cognitive science. When the target is explanatory,the phenomenal interpretations of dimensions are in focus, and when thegoal is constructive, the scientifically modelled dimensions are consid-ered. Different scales of measurements including nominal, ordinal, in-terval, and ratio are used to assign values to the observations [86]. Thevalues of these measurements on quality dimensions can be categorical(e.g., blood type) or continuous (e.g., size). Thus, these measurementsenable the quality dimensions to calculate distance value between eachtwo measured values. Depending on the type of the features (categoricalor continuous) for the perceived information, different distance measurescan be defined for each domain, separately.

• Domains: A domain in a conceptual space is represented as a set of in-terdependent quality dimensions which are logically integrated. A typicalexample of a domain is colour which is presented as a three-dimensionalspace in Figure 2.1. Shape, taste, size, and weight are other examplesof perceptual domains. Some of the domains, like weight can be pre-sented by a single dimension. The main reason to decompose the struc-ture of conceptual spaces into domains, as Gärdenfors proposed, is toassign concepts to different quality attributes independently. As an exam-ple, an object can be independently described by its colour, without anyneed to consider its weight. According to the original definition of con-ceptual spaces, quality dimensions that depend on each other in forminga domain are considered to be integral dimensions, as opposed to sep-arable ones [86]. Thus, within a domain, one cannot logically assign avalue to one dimension without assigning values to the other dimensions.For example, a point within the colour domain cannot be defined withbrightness and hue but without saturation.

• Properties: A property is a region in a single domain. As an example,green is a property corresponding to a region in the colour domain. Nat-ural Properties are the convex regions expressing a particular attribute


•QualityDimensions:Aconceptualspaceconsistsofasetofqualitydi-mensions(i.e.,cognitivelymeaningfulattributes).Thequalitydimensionspresentthequalityattributesofobjectsinametricspacebasedontheirmeasuredqualityvalues.Someexamplesofqualitydimensionscanbenotionslikeheight,width,anddepth.Qualitydimensionscanbeeitherinterpretedasphenomenal(psychological)orscientific(theoretical)di-mensions.Thepsychologicalinterpretationofqualitydimensionsrepre-sentsthephenomenalhumanresponsesinasemanticallymeaningfulway,whicharecomingfromhumanperceptions.Colourperceptionisaphe-nomenalexamplethatcanbedescribedbythreequalitydimensions:hue,saturation,andbrightness.Thescientificinterpretationisdefinedbasedonscientifictheorieswhichmeasurethevaluesassociatedwithe.g.,sen-sorsthatmeasurewavelengths[86].Thisdistinctionistightlyrelatedtothementionedgoalsofcognitivescience.Whenthetargetisexplanatory,thephenomenalinterpretationsofdimensionsareinfocus,andwhenthegoalisconstructive,thescientificallymodelleddimensionsareconsid-ered.Differentscalesofmeasurementsincludingnominal,ordinal,in-terval,andratioareusedtoassignvaluestotheobservations[86].Thevaluesofthesemeasurementsonqualitydimensionscanbecategorical(e.g.,bloodtype)orcontinuous(e.g.,size).Thus,thesemeasurementsenablethequalitydimensionstocalculatedistancevaluebetweeneachtwomeasuredvalues.Dependingonthetypeofthefeatures(categoricalorcontinuous)fortheperceivedinformation,differentdistancemeasurescanbedefinedforeachdomain,separately.

•Domains:Adomaininaconceptualspaceisrepresentedasasetofin-terdependentqualitydimensionswhicharelogicallyintegrated.Atypicalexampleofadomainiscolourwhichispresentedasathree-dimensionalspaceinFigure2.1.Shape,taste,size,andweightareotherexamplesofperceptualdomains.Someofthedomains,likeweightcanbepre-sentedbyasingledimension.Themainreasontodecomposethestruc-tureofconceptualspacesintodomains,asGärdenforsproposed,istoassignconceptstodifferentqualityattributesindependently.Asanexam-ple,anobjectcanbeindependentlydescribedbyitscolour,withoutanyneedtoconsideritsweight.Accordingtotheoriginaldefinitionofcon-ceptualspaces,qualitydimensionsthatdependoneachotherinformingadomainareconsideredtobeintegraldimensions,asopposedtosep-arableones[86].Thus,withinadomain,onecannotlogicallyassignavaluetoonedimensionwithoutassigningvaluestotheotherdimensions.Forexample,apointwithinthecolourdomaincannotbedefinedwithbrightnessandhuebutwithoutsaturation.

•Properties:Apropertyisaregioninasingledomain.Asanexample,greenisapropertycorrespondingtoaregioninthecolourdomain.Nat-uralPropertiesaretheconvexregionsexpressingaparticularattribute


• Quality Dimensions: A conceptual space consists of a set of quality di-mensions (i.e., cognitively meaningful attributes). The quality dimensionspresent the quality attributes of objects in a metric space based on theirmeasured quality values. Some examples of quality dimensions can benotions like height, width, and depth. Quality dimensions can be eitherinterpreted as phenomenal (psychological) or scientific (theoretical) di-mensions. The psychological interpretation of quality dimensions repre-sents the phenomenal human responses in a semantically meaningful way,which are coming from human perceptions. Colour perception is a phe-nomenal example that can be described by three quality dimensions: hue,saturation, and brightness. The scientific interpretation is defined basedon scientific theories which measure the values associated with e.g., sen-sors that measure wavelengths [86]. This distinction is tightly related tothe mentioned goals of cognitive science. When the target is explanatory,the phenomenal interpretations of dimensions are in focus, and when thegoal is constructive, the scientifically modelled dimensions are consid-ered. Different scales of measurements including nominal, ordinal, in-terval, and ratio are used to assign values to the observations [86]. Thevalues of these measurements on quality dimensions can be categorical(e.g., blood type) or continuous (e.g., size). Thus, these measurementsenable the quality dimensions to calculate distance value between eachtwo measured values. Depending on the type of the features (categoricalor continuous) for the perceived information, different distance measurescan be defined for each domain, separately.

• Domains: A domain in a conceptual space is represented as a set of in-terdependent quality dimensions which are logically integrated. A typicalexample of a domain is colour which is presented as a three-dimensionalspace in Figure 2.1. Shape, taste, size, and weight are other examplesof perceptual domains. Some of the domains, like weight can be pre-sented by a single dimension. The main reason to decompose the struc-ture of conceptual spaces into domains, as Gärdenfors proposed, is toassign concepts to different quality attributes independently. As an exam-ple, an object can be independently described by its colour, without anyneed to consider its weight. According to the original definition of con-ceptual spaces, quality dimensions that depend on each other in forminga domain are considered to be integral dimensions, as opposed to sep-arable ones [86]. Thus, within a domain, one cannot logically assign avalue to one dimension without assigning values to the other dimensions.For example, a point within the colour domain cannot be defined withbrightness and hue but without saturation.

• Properties: A property is a region in a single domain. As an example,green is a property corresponding to a region in the colour domain. Nat-ural Properties are the convex regions expressing a particular attribute


•QualityDimensions:Aconceptualspaceconsistsofasetofqualitydi-mensions(i.e.,cognitivelymeaningfulattributes).Thequalitydimensionspresentthequalityattributesofobjectsinametricspacebasedontheirmeasuredqualityvalues.Someexamplesofqualitydimensionscanbenotionslikeheight,width,anddepth.Qualitydimensionscanbeeitherinterpretedasphenomenal(psychological)orscientific(theoretical)di-mensions.Thepsychologicalinterpretationofqualitydimensionsrepre-sentsthephenomenalhumanresponsesinasemanticallymeaningfulway,whicharecomingfromhumanperceptions.Colourperceptionisaphe-nomenalexamplethatcanbedescribedbythreequalitydimensions:hue,saturation,andbrightness.Thescientificinterpretationisdefinedbasedonscientifictheorieswhichmeasurethevaluesassociatedwithe.g.,sen-sorsthatmeasurewavelengths[86].Thisdistinctionistightlyrelatedtothementionedgoalsofcognitivescience.Whenthetargetisexplanatory,thephenomenalinterpretationsofdimensionsareinfocus,andwhenthegoalisconstructive,thescientificallymodelleddimensionsareconsid-ered.Differentscalesofmeasurementsincludingnominal,ordinal,in-terval,andratioareusedtoassignvaluestotheobservations[86].Thevaluesofthesemeasurementsonqualitydimensionscanbecategorical(e.g.,bloodtype)orcontinuous(e.g.,size).Thus,thesemeasurementsenablethequalitydimensionstocalculatedistancevaluebetweeneachtwomeasuredvalues.Dependingonthetypeofthefeatures(categoricalorcontinuous)fortheperceivedinformation,differentdistancemeasurescanbedefinedforeachdomain,separately.

•Domains:Adomaininaconceptualspaceisrepresentedasasetofin-terdependentqualitydimensionswhicharelogicallyintegrated.Atypicalexampleofadomainiscolourwhichispresentedasathree-dimensionalspaceinFigure2.1.Shape,taste,size,andweightareotherexamplesofperceptualdomains.Someofthedomains,likeweightcanbepre-sentedbyasingledimension.Themainreasontodecomposethestruc-tureofconceptualspacesintodomains,asGärdenforsproposed,istoassignconceptstodifferentqualityattributesindependently.Asanexam-ple,anobjectcanbeindependentlydescribedbyitscolour,withoutanyneedtoconsideritsweight.Accordingtotheoriginaldefinitionofcon-ceptualspaces,qualitydimensionsthatdependoneachotherinformingadomainareconsideredtobeintegraldimensions,asopposedtosep-arableones[86].Thus,withinadomain,onecannotlogicallyassignavaluetoonedimensionwithoutassigningvaluestotheotherdimensions.Forexample,apointwithinthecolourdomaincannotbedefinedwithbrightnessandhuebutwithoutsaturation.

•Properties:Apropertyisaregioninasingledomain.Asanexample,greenisapropertycorrespondingtoaregioninthecolourdomain.Nat-uralPropertiesaretheconvexregionsexpressingaparticularattribute


of the domain. In natural language, properties are often associated withadjectives. Grasping the properties of a domain is not necessarily an intu-itive task unless specified by the domain knowledge [86].

• Concepts: Concepts in a conceptual space are represented as a set of re-gions through multiple domains, and are modelled as n-dimensional areasin the space. A concept is described based on its properties in various do-mains. For instance, the concept of Apple in a conceptual space of fruitscan be represented as a set of regions in the colour, shape, size, taste,and weight domains respectively. In natural language, the concepts oftencorrespond to nouns. Some domains may be more salient while repre-senting specific concepts. For example, to distinguish the concept Applefrom other concepts like Pear, the colour and taste domains will be moresalient than the weight domain.

• Instances (Objects): Instances of concepts are the sets of points in theconceptual space. These points are located within the domains by takingthe values based on the quality dimensions.

Example 2.1. Assume a conceptual space for “fruits”. To define the Fruit space,one can introduce different domains such as ‘colour’, ‘taste’, ‘size’, ‘shape’,‘nutrition’, etc., where these domains are defined by quality dimensions (e.g.,‘brightness’, ‘hue’, and ‘saturation’ dimensions for the colour domain). Now,a fruit concept like apple can be presented in this space by a set of regions(i.e., properties) within the domains. Verbally, the concept of an apple involvesa ‘green’ property (a region) in the colour domain, ‘sweet-sour’ property inthe taste domain, ‘roundish’ property in the shape domain, and so forth [89].Now, one instance (or an object) of the concept apple can be presented by aset of points belonging to the regions of the concept. A ‘sweet apple’ object inthe Fruit space has a multi-point presentation that contains a point in the e.g.,sweet-sour region in the taste domain.

Following the above example, Figure 2.1 shows a schematic presentation ofa conceptual space of fruits, together with presenting the regions of the conceptof apple within the defined domains and quality dimensions.

The metric definition of domains allows one to depict the notion of seman-tic similarity in a conceptual space. Measuring the similarity robustly eases theconsideration of cognitive tasks such as concept formation, semantic inferences,induction, and concept learning [116]. Context is another notion that has beenconsidered in the theory of conceptual spaces. Since the semantics of conceptsare conceptual structures in individual minds, the meaning of elements differs invarious contexts. Thus, to formulate the context within the concept representa-tion, it is possible to assign weights to the domains or dimensions to distinguishbetween similar concepts in different contexts [87].


ofthedomain.Innaturallanguage,propertiesareoftenassociatedwithadjectives.Graspingthepropertiesofadomainisnotnecessarilyanintu-itivetaskunlessspecifiedbythedomainknowledge[86].

•Concepts:Conceptsinaconceptualspacearerepresentedasasetofre-gionsthroughmultipledomains,andaremodelledasn-dimensionalareasinthespace.Aconceptisdescribedbasedonitspropertiesinvariousdo-mains.Forinstance,theconceptofAppleinaconceptualspaceoffruitscanberepresentedasasetofregionsinthecolour,shape,size,taste,andweightdomainsrespectively.Innaturallanguage,theconceptsoftencorrespondtonouns.Somedomainsmaybemoresalientwhilerepre-sentingspecificconcepts.Forexample,todistinguishtheconceptApplefromotherconceptslikePear,thecolourandtastedomainswillbemoresalientthantheweightdomain.

•Instances(Objects):Instancesofconceptsarethesetsofpointsintheconceptualspace.Thesepointsarelocatedwithinthedomainsbytakingthevaluesbasedonthequalitydimensions.

Example2.1.Assumeaconceptualspacefor“fruits”.TodefinetheFruitspace,onecanintroducedifferentdomainssuchas‘colour’,‘taste’,‘size’,‘shape’,‘nutrition’,etc.,wherethesedomainsaredefinedbyqualitydimensions(e.g.,‘brightness’,‘hue’,and‘saturation’dimensionsforthecolourdomain).Now,afruitconceptlikeapplecanbepresentedinthisspacebyasetofregions(i.e.,properties)withinthedomains.Verbally,theconceptofanappleinvolvesa‘green’property(aregion)inthecolourdomain,‘sweet-sour’propertyinthetastedomain,‘roundish’propertyintheshapedomain,andsoforth[89].Now,oneinstance(oranobject)oftheconceptapplecanbepresentedbyasetofpointsbelongingtotheregionsoftheconcept.A‘sweetapple’objectintheFruitspacehasamulti-pointpresentationthatcontainsapointinthee.g.,sweet-sourregioninthetastedomain.

Followingtheaboveexample,Figure2.1showsaschematicpresentationofaconceptualspaceoffruits,togetherwithpresentingtheregionsoftheconceptofapplewithinthedefineddomainsandqualitydimensions.

Themetricdefinitionofdomainsallowsonetodepictthenotionofseman-ticsimilarityinaconceptualspace.Measuringthesimilarityrobustlyeasestheconsiderationofcognitivetaskssuchasconceptformation,semanticinferences,induction,andconceptlearning[116].Contextisanothernotionthathasbeenconsideredinthetheoryofconceptualspaces.Sincethesemanticsofconceptsareconceptualstructuresinindividualminds,themeaningofelementsdiffersinvariouscontexts.Thus,toformulatethecontextwithintheconceptrepresenta-tion,itispossibletoassignweightstothedomainsordimensionstodistinguishbetweensimilarconceptsindifferentcontexts[87].


of the domain. In natural language, properties are often associated withadjectives. Grasping the properties of a domain is not necessarily an intu-itive task unless specified by the domain knowledge [86].

• Concepts: Concepts in a conceptual space are represented as a set of re-gions through multiple domains, and are modelled as n-dimensional areasin the space. A concept is described based on its properties in various do-mains. For instance, the concept of Apple in a conceptual space of fruitscan be represented as a set of regions in the colour, shape, size, taste,and weight domains respectively. In natural language, the concepts oftencorrespond to nouns. Some domains may be more salient while repre-senting specific concepts. For example, to distinguish the concept Applefrom other concepts like Pear, the colour and taste domains will be moresalient than the weight domain.

• Instances (Objects): Instances of concepts are the sets of points in theconceptual space. These points are located within the domains by takingthe values based on the quality dimensions.

Example 2.1. Assume a conceptual space for “fruits”. To define the Fruit space,one can introduce different domains such as ‘colour’, ‘taste’, ‘size’, ‘shape’,‘nutrition’, etc., where these domains are defined by quality dimensions (e.g.,‘brightness’, ‘hue’, and ‘saturation’ dimensions for the colour domain). Now,a fruit concept like apple can be presented in this space by a set of regions(i.e., properties) within the domains. Verbally, the concept of an apple involvesa ‘green’ property (a region) in the colour domain, ‘sweet-sour’ property inthe taste domain, ‘roundish’ property in the shape domain, and so forth [89].Now, one instance (or an object) of the concept apple can be presented by aset of points belonging to the regions of the concept. A ‘sweet apple’ object inthe Fruit space has a multi-point presentation that contains a point in the e.g.,sweet-sour region in the taste domain.

Following the above example, Figure 2.1 shows a schematic presentation ofa conceptual space of fruits, together with presenting the regions of the conceptof apple within the defined domains and quality dimensions.

The metric definition of domains allows one to depict the notion of seman-tic similarity in a conceptual space. Measuring the similarity robustly eases theconsideration of cognitive tasks such as concept formation, semantic inferences,induction, and concept learning [116]. Context is another notion that has beenconsidered in the theory of conceptual spaces. Since the semantics of conceptsare conceptual structures in individual minds, the meaning of elements differs invarious contexts. Thus, to formulate the context within the concept representa-tion, it is possible to assign weights to the domains or dimensions to distinguishbetween similar concepts in different contexts [87].


ofthedomain.Innaturallanguage,propertiesareoftenassociatedwithadjectives.Graspingthepropertiesofadomainisnotnecessarilyanintu-itivetaskunlessspecifiedbythedomainknowledge[86].

•Concepts:Conceptsinaconceptualspacearerepresentedasasetofre-gionsthroughmultipledomains,andaremodelledasn-dimensionalareasinthespace.Aconceptisdescribedbasedonitspropertiesinvariousdo-mains.Forinstance,theconceptofAppleinaconceptualspaceoffruitscanberepresentedasasetofregionsinthecolour,shape,size,taste,andweightdomainsrespectively.Innaturallanguage,theconceptsoftencorrespondtonouns.Somedomainsmaybemoresalientwhilerepre-sentingspecificconcepts.Forexample,todistinguishtheconceptApplefromotherconceptslikePear,thecolourandtastedomainswillbemoresalientthantheweightdomain.

•Instances(Objects):Instancesofconceptsarethesetsofpointsintheconceptualspace.Thesepointsarelocatedwithinthedomainsbytakingthevaluesbasedonthequalitydimensions.

Example2.1.Assumeaconceptualspacefor“fruits”.TodefinetheFruitspace,onecanintroducedifferentdomainssuchas‘colour’,‘taste’,‘size’,‘shape’,‘nutrition’,etc.,wherethesedomainsaredefinedbyqualitydimensions(e.g.,‘brightness’,‘hue’,and‘saturation’dimensionsforthecolourdomain).Now,afruitconceptlikeapplecanbepresentedinthisspacebyasetofregions(i.e.,properties)withinthedomains.Verbally,theconceptofanappleinvolvesa‘green’property(aregion)inthecolourdomain,‘sweet-sour’propertyinthetastedomain,‘roundish’propertyintheshapedomain,andsoforth[89].Now,oneinstance(oranobject)oftheconceptapplecanbepresentedbyasetofpointsbelongingtotheregionsoftheconcept.A‘sweetapple’objectintheFruitspacehasamulti-pointpresentationthatcontainsapointinthee.g.,sweet-sourregioninthetastedomain.

Followingtheaboveexample,Figure2.1showsaschematicpresentationofaconceptualspaceoffruits,togetherwithpresentingtheregionsoftheconceptofapplewithinthedefineddomainsandqualitydimensions.

Themetricdefinitionofdomainsallowsonetodepictthenotionofseman-ticsimilarityinaconceptualspace.Measuringthesimilarityrobustlyeasestheconsiderationofcognitivetaskssuchasconceptformation,semanticinferences,induction,andconceptlearning[116].Contextisanothernotionthathasbeenconsideredinthetheoryofconceptualspaces.Sincethesemanticsofconceptsareconceptualstructuresinindividualminds,themeaningofelementsdiffersinvariouscontexts.Thus,toformulatethecontextwithintheconceptrepresenta-tion,itispossibletoassignweightstothedomainsordimensionstodistinguishbetweensimilarconceptsindifferentcontexts[87].


Figure 2.1: A schematic presentation of a conceptual space of fruits. Italso shows the representation the concept of apple as a set of regionswithin the defined domains and quality dimensions as apple = 〈‘sweet-sour’,‘ green’,‘ medium’〉.

Various formalisations of the conceptual spaces theory have been proposedin the literature, including [14, 116, 181, 189], which attempted to mathemati-cally formalise how to construct and how to perform induction within concep-tual spaces.

2.2.1 Identifying Quality Dimensions

The quality dimensions of human perceptions are revealed from the judgementsabout the similarity of concepts in a cognitive process [87]. The origin of qualitydimensions is still an open question in the field of conceptual spaces [86]. Oncethe process of constructing a conceptual space starts, as Quine noted in [176],some innate quality dimensions are needed to make concept learning possible.However, there is no unique way to specify which set of dimensions is sufficientto characterise the concepts to be learned. There are two paradigms to identifyquality dimensions: being chosen or being inferred. For a conceptual space thataims to model a scientific theory or an artificial cognitive task, quality dimen-sions are usually chosen by the developers of the theories (experts or scientist).In contrast, for phenomenal conceptual spaces, the quality dimensions need tobe inferred from the perceived behaviours of the subjects. In many developedexamples of conceptual spaces, determining the set of quality dimensions relieson the background knowledge, which comes from phenomenal (human per-ceptual) or scientific (sensory) quality dimensions [86]. Thus, in a real-worldapplication, there is a need for knowledge engineering from experts within theapplication [188].


Figure2.1:Aschematicpresentationofaconceptualspaceoffruits.Italsoshowstherepresentationtheconceptofappleasasetofregionswithinthedefineddomainsandqualitydimensionsasapple=〈‘sweet-sour’,‘green’,‘medium’〉.

Variousformalisationsoftheconceptualspacestheoryhavebeenproposedintheliterature,including[14,116,181,189],whichattemptedtomathemati-callyformalisehowtoconstructandhowtoperforminductionwithinconcep-tualspaces.

2.2.1IdentifyingQualityDimensions

Thequalitydimensionsofhumanperceptionsarerevealedfromthejudgementsaboutthesimilarityofconceptsinacognitiveprocess[87].Theoriginofqualitydimensionsisstillanopenquestioninthefieldofconceptualspaces[86].Oncetheprocessofconstructingaconceptualspacestarts,asQuinenotedin[176],someinnatequalitydimensionsareneededtomakeconceptlearningpossible.However,thereisnouniquewaytospecifywhichsetofdimensionsissufficienttocharacterisetheconceptstobelearned.Therearetwoparadigmstoidentifyqualitydimensions:beingchosenorbeinginferred.Foraconceptualspacethataimstomodelascientifictheoryoranartificialcognitivetask,qualitydimen-sionsareusuallychosenbythedevelopersofthetheories(expertsorscientist).Incontrast,forphenomenalconceptualspaces,thequalitydimensionsneedtobeinferredfromtheperceivedbehavioursofthesubjects.Inmanydevelopedexamplesofconceptualspaces,determiningthesetofqualitydimensionsreliesonthebackgroundknowledge,whichcomesfromphenomenal(humanper-ceptual)orscientific(sensory)qualitydimensions[86].Thus,inareal-worldapplication,thereisaneedforknowledgeengineeringfromexpertswithintheapplication[188].


Figure 2.1: A schematic presentation of a conceptual space of fruits. Italso shows the representation the concept of apple as a set of regionswithin the defined domains and quality dimensions as apple = 〈‘sweet-sour’,‘ green’,‘ medium’〉.

Various formalisations of the conceptual spaces theory have been proposedin the literature, including [14, 116, 181, 189], which attempted to mathemati-cally formalise how to construct and how to perform induction within concep-tual spaces.

2.2.1 Identifying Quality Dimensions

The quality dimensions of human perceptions are revealed from the judgementsabout the similarity of concepts in a cognitive process [87]. The origin of qualitydimensions is still an open question in the field of conceptual spaces [86]. Oncethe process of constructing a conceptual space starts, as Quine noted in [176],some innate quality dimensions are needed to make concept learning possible.However, there is no unique way to specify which set of dimensions is sufficientto characterise the concepts to be learned. There are two paradigms to identifyquality dimensions: being chosen or being inferred. For a conceptual space thataims to model a scientific theory or an artificial cognitive task, quality dimen-sions are usually chosen by the developers of the theories (experts or scientist).In contrast, for phenomenal conceptual spaces, the quality dimensions need tobe inferred from the perceived behaviours of the subjects. In many developedexamples of conceptual spaces, determining the set of quality dimensions relieson the background knowledge, which comes from phenomenal (human per-ceptual) or scientific (sensory) quality dimensions [86]. Thus, in a real-worldapplication, there is a need for knowledge engineering from experts within theapplication [188].


Figure2.1:Aschematicpresentationofaconceptualspaceoffruits.Italsoshowstherepresentationtheconceptofappleasasetofregionswithinthedefineddomainsandqualitydimensionsasapple=〈‘sweet-sour’,‘green’,‘medium’〉.

Variousformalisationsoftheconceptualspacestheoryhavebeenproposedintheliterature,including[14,116,181,189],whichattemptedtomathemati-callyformalisehowtoconstructandhowtoperforminductionwithinconcep-tualspaces.

2.2.1IdentifyingQualityDimensions

Thequalitydimensionsofhumanperceptionsarerevealedfromthejudgementsaboutthesimilarityofconceptsinacognitiveprocess[87].Theoriginofqualitydimensionsisstillanopenquestioninthefieldofconceptualspaces[86].Oncetheprocessofconstructingaconceptualspacestarts,asQuinenotedin[176],someinnatequalitydimensionsareneededtomakeconceptlearningpossible.However,thereisnouniquewaytospecifywhichsetofdimensionsissufficienttocharacterisetheconceptstobelearned.Therearetwoparadigmstoidentifyqualitydimensions:beingchosenorbeinginferred.Foraconceptualspacethataimstomodelascientifictheoryoranartificialcognitivetask,qualitydimen-sionsareusuallychosenbythedevelopersofthetheories(expertsorscientist).Incontrast,forphenomenalconceptualspaces,thequalitydimensionsneedtobeinferredfromtheperceivedbehavioursofthesubjects.Inmanydevelopedexamplesofconceptualspaces,determiningthesetofqualitydimensionsreliesonthebackgroundknowledge,whichcomesfromphenomenal(humanper-ceptual)orscientific(sensory)qualitydimensions[86].Thus,inareal-worldapplication,thereisaneedforknowledgeengineeringfromexpertswithintheapplication[188].


However, the issue of identifying the quality dimensions is more challengingwhen dealing with such systems where there is no prior knowledge to explainthe semantics of dimensions, or there is a lack of knowledge about relevantquality dimensions [87]. This specific challenge motivates the investigations onhow to derive the domains and quality dimensions in a data-driven manner.

An agreed point in the literature of conceptual spaces is that it is almost im-possible to provide a complete list of human perceptual quality dimensions [89].This statement emerges from the task of concept learning, wherein learning anew concept is often applicable by expanding a conceptual space with newquality dimensions [209]. To initialise a conceptual space, it would be chal-lenging to realise which set of input features from the data can be used as thequality dimensions in order to form the target concepts. The work in this thesisattempts to identify a meaningful set of features out of predefined features ina numeric data set, relying on the hypothesis that the most discriminative fea-tures of concepts (classes) are the most representative quality dimensions forbuilding a conceptual space.

2.2.2 Related Work on Conceptual Spaces and AI

As mentioned in the introduction, from the AI point of view, the aim of repre-senting knowledge in a conceptual space is to develop an intuitive interpretationof the relationship between symbolic and sub-symbolic information [14, 86].Gärdenfors has discussed thoroughly the role of conceptual spaces as a knowl-edge representation framework in AI systems [87], focusing on the tasks ofinduction and reasoning [88, 92]. Recently, Lieto et al. [147] have detailed theneed of a conceptual representation as a mid-level of knowledge representationin-between the symbolic and the sub-symbolic one. This offers cognitive archi-tectures a common language enabling the interaction between different typesof representations. Schockaert and Prade [198] have focused on the problem ofinterpolative and extrapolative inference for different properties and conceptswith the help of conceptual spaces. In addition to the theoretical AI problems,the feasibility of using conceptual spaces has been studied in various appli-cation domains of AI, such as geographical measurement [8, 199], cognitiverobotics [57, 66, 138], object recognition [46], and visual perception [56]. Arecent review [242] discusses further applications in diverse research areas (se-mantic spaces, computing meanings, and philosophical perspectives).

Concept formation tightly connects the theory of conceptual spaces to theinduction (and particularly learning) problem. The aim of many learning sys-tems is a general description of a category of observations as concepts [151].If the input of a learning algorithm takes the form of instances, attributes,and concepts, then the process of learning is called concept description [230].Instance-based learning refers to a class of learning algorithms which predicatethe labels for unseen instances based on their similarity to the nearest train-ing instances [129]. This model requires a similarity function to perform the


However,theissueofidentifyingthequalitydimensionsismorechallengingwhendealingwithsuchsystemswherethereisnopriorknowledgetoexplainthesemanticsofdimensions,orthereisalackofknowledgeaboutrelevantqualitydimensions[87].Thisspecificchallengemotivatestheinvestigationsonhowtoderivethedomainsandqualitydimensionsinadata-drivenmanner.

Anagreedpointintheliteratureofconceptualspacesisthatitisalmostim-possibletoprovideacompletelistofhumanperceptualqualitydimensions[89].Thisstatementemergesfromthetaskofconceptlearning,whereinlearninganewconceptisoftenapplicablebyexpandingaconceptualspacewithnewqualitydimensions[209].Toinitialiseaconceptualspace,itwouldbechal-lengingtorealisewhichsetofinputfeaturesfromthedatacanbeusedasthequalitydimensionsinordertoformthetargetconcepts.Theworkinthisthesisattemptstoidentifyameaningfulsetoffeaturesoutofpredefinedfeaturesinanumericdataset,relyingonthehypothesisthatthemostdiscriminativefea-turesofconcepts(classes)arethemostrepresentativequalitydimensionsforbuildingaconceptualspace.

2.2.2RelatedWorkonConceptualSpacesandAI

Asmentionedintheintroduction,fromtheAIpointofview,theaimofrepre-sentingknowledgeinaconceptualspaceistodevelopanintuitiveinterpretationoftherelationshipbetweensymbolicandsub-symbolicinformation[14,86].Gärdenforshasdiscussedthoroughlytheroleofconceptualspacesasaknowl-edgerepresentationframeworkinAIsystems[87],focusingonthetasksofinductionandreasoning[88,92].Recently,Lietoetal.[147]havedetailedtheneedofaconceptualrepresentationasamid-levelofknowledgerepresentationin-betweenthesymbolicandthesub-symbolicone.Thisofferscognitivearchi-tecturesacommonlanguageenablingtheinteractionbetweendifferenttypesofrepresentations.SchockaertandPrade[198]havefocusedontheproblemofinterpolativeandextrapolativeinferencefordifferentpropertiesandconceptswiththehelpofconceptualspaces.InadditiontothetheoreticalAIproblems,thefeasibilityofusingconceptualspaceshasbeenstudiedinvariousappli-cationdomainsofAI,suchasgeographicalmeasurement[8,199],cognitiverobotics[57,66,138],objectrecognition[46],andvisualperception[56].Arecentreview[242]discussesfurtherapplicationsindiverseresearchareas(se-manticspaces,computingmeanings,andphilosophicalperspectives).

Conceptformationtightlyconnectsthetheoryofconceptualspacestotheinduction(andparticularlylearning)problem.Theaimofmanylearningsys-temsisageneraldescriptionofacategoryofobservationsasconcepts[151].Iftheinputofalearningalgorithmtakestheformofinstances,attributes,andconcepts,thentheprocessoflearningiscalledconceptdescription[230].Instance-basedlearningreferstoaclassoflearningalgorithmswhichpredicatethelabelsforunseeninstancesbasedontheirsimilaritytothenearesttrain-inginstances[129].Thismodelrequiresasimilarityfunctiontoperformthe


However, the issue of identifying the quality dimensions is more challengingwhen dealing with such systems where there is no prior knowledge to explainthe semantics of dimensions, or there is a lack of knowledge about relevantquality dimensions [87]. This specific challenge motivates the investigations onhow to derive the domains and quality dimensions in a data-driven manner.

An agreed point in the literature of conceptual spaces is that it is almost im-possible to provide a complete list of human perceptual quality dimensions [89].This statement emerges from the task of concept learning, wherein learning anew concept is often applicable by expanding a conceptual space with newquality dimensions [209]. To initialise a conceptual space, it would be chal-lenging to realise which set of input features from the data can be used as thequality dimensions in order to form the target concepts. The work in this thesisattempts to identify a meaningful set of features out of predefined features ina numeric data set, relying on the hypothesis that the most discriminative fea-tures of concepts (classes) are the most representative quality dimensions forbuilding a conceptual space.

2.2.2 Related Work on Conceptual Spaces and AI

As mentioned in the introduction, from the AI point of view, the aim of repre-senting knowledge in a conceptual space is to develop an intuitive interpretationof the relationship between symbolic and sub-symbolic information [14, 86].Gärdenfors has discussed thoroughly the role of conceptual spaces as a knowl-edge representation framework in AI systems [87], focusing on the tasks ofinduction and reasoning [88, 92]. Recently, Lieto et al. [147] have detailed theneed of a conceptual representation as a mid-level of knowledge representationin-between the symbolic and the sub-symbolic one. This offers cognitive archi-tectures a common language enabling the interaction between different typesof representations. Schockaert and Prade [198] have focused on the problem ofinterpolative and extrapolative inference for different properties and conceptswith the help of conceptual spaces. In addition to the theoretical AI problems,the feasibility of using conceptual spaces has been studied in various appli-cation domains of AI, such as geographical measurement [8, 199], cognitiverobotics [57, 66, 138], object recognition [46], and visual perception [56]. Arecent review [242] discusses further applications in diverse research areas (se-mantic spaces, computing meanings, and philosophical perspectives).

Concept formation tightly connects the theory of conceptual spaces to theinduction (and particularly learning) problem. The aim of many learning sys-tems is a general description of a category of observations as concepts [151].If the input of a learning algorithm takes the form of instances, attributes,and concepts, then the process of learning is called concept description [230].Instance-based learning refers to a class of learning algorithms which predicatethe labels for unseen instances based on their similarity to the nearest train-ing instances [129]. This model requires a similarity function to perform the


However,theissueofidentifyingthequalitydimensionsismorechallengingwhendealingwithsuchsystemswherethereisnopriorknowledgetoexplainthesemanticsofdimensions,orthereisalackofknowledgeaboutrelevantqualitydimensions[87].Thisspecificchallengemotivatestheinvestigationsonhowtoderivethedomainsandqualitydimensionsinadata-drivenmanner.

Anagreedpointintheliteratureofconceptualspacesisthatitisalmostim-possibletoprovideacompletelistofhumanperceptualqualitydimensions[89].Thisstatementemergesfromthetaskofconceptlearning,whereinlearninganewconceptisoftenapplicablebyexpandingaconceptualspacewithnewqualitydimensions[209].Toinitialiseaconceptualspace,itwouldbechal-lengingtorealisewhichsetofinputfeaturesfromthedatacanbeusedasthequalitydimensionsinordertoformthetargetconcepts.Theworkinthisthesisattemptstoidentifyameaningfulsetoffeaturesoutofpredefinedfeaturesinanumericdataset,relyingonthehypothesisthatthemostdiscriminativefea-turesofconcepts(classes)arethemostrepresentativequalitydimensionsforbuildingaconceptualspace.

2.2.2RelatedWorkonConceptualSpacesandAI

Asmentionedintheintroduction,fromtheAIpointofview,theaimofrepre-sentingknowledgeinaconceptualspaceistodevelopanintuitiveinterpretationoftherelationshipbetweensymbolicandsub-symbolicinformation[14,86].Gärdenforshasdiscussedthoroughlytheroleofconceptualspacesasaknowl-edgerepresentationframeworkinAIsystems[87],focusingonthetasksofinductionandreasoning[88,92].Recently,Lietoetal.[147]havedetailedtheneedofaconceptualrepresentationasamid-levelofknowledgerepresentationin-betweenthesymbolicandthesub-symbolicone.Thisofferscognitivearchi-tecturesacommonlanguageenablingtheinteractionbetweendifferenttypesofrepresentations.SchockaertandPrade[198]havefocusedontheproblemofinterpolativeandextrapolativeinferencefordifferentpropertiesandconceptswiththehelpofconceptualspaces.InadditiontothetheoreticalAIproblems,thefeasibilityofusingconceptualspaceshasbeenstudiedinvariousappli-cationdomainsofAI,suchasgeographicalmeasurement[8,199],cognitiverobotics[57,66,138],objectrecognition[46],andvisualperception[56].Arecentreview[242]discussesfurtherapplicationsindiverseresearchareas(se-manticspaces,computingmeanings,andphilosophicalperspectives).

Conceptformationtightlyconnectsthetheoryofconceptualspacestotheinduction(andparticularlylearning)problem.Theaimofmanylearningsys-temsisageneraldescriptionofacategoryofobservationsasconcepts[151].Iftheinputofalearningalgorithmtakestheformofinstances,attributes,andconcepts,thentheprocessoflearningiscalledconceptdescription[230].Instance-basedlearningreferstoaclassoflearningalgorithmswhichpredicatethelabelsforunseeninstancesbasedontheirsimilaritytothenearesttrain-inginstances[129].Thismodelrequiresasimilarityfunctiontoperformthe


task of concept descriptions. However, in instance-based learning, the similar-ity functions are usually applied within a single-domain feature space [12]. Acomparison of the practicality and effectiveness in instance-based learning andconceptual spaces was presented in [140]. However, in this work, the authorsdid not include the model construction process in their discussion. In this thesis,an instance-based approach is proposed for concept formation that considersthe role of the features involved to derive a multi-domain space and representthe concepts in such a space.

Using data mining approaches in the process of deriving conceptual spaceshas been studied in a few isolated works. Keßler [131] outlined the idea of us-ing conceptual spaces to describe data, with some discussions on the possibilityof automatically generating such spaces from databases. Lee [139] proposed adata mining method coupled with conceptual spaces, which addresses cognitivetasks such as concept formation using clustering techniques. The main draw-back of these approaches is that they rely on knowing about the semantics ofa domain (i.e., an application area) beforehand in order to directly determinethe domains and the quality dimensions of a conceptual space. However, an es-sential challenge is to automatically find the most related features as integratedquality dimensions, without importing extra knowledge to the model. Moreprecisely, the data-driven side of this thesis relies on the automatically findingthe best subset of features and then grouping them into the domains, withoutexpert knowledge.

The proposed approach in this thesis holds for certain classes of problems.It explores applications wherein the input data is complicated to be inter-preted at first glance. Within such applications, the task of specifying the in-terpretable domains and dimensions based on human perceptions is no trivial.These classes of problems usually deal with raw sensor data (sometimes multi-variate data) with little or no prior knowledge about their semantics [188]. Theprocess of learning for these problems are typically performed by connectionistapproaches (i.e., neural network architectures [219]) as solution for represent-ing the relation of instances on a perceptual level [14]. But the main drawbackof such solutions is that it neglects the explainability of the involved conceptsor the interpretability of the learned model (i.e., features) from a semantic per-spective.

This thesis aims to enable domain formation of a conceptual space thatis highly data-driven. The motivation is to create a semantic model able topreserve induction and semantic inference. The motivation of the proposedapproach is to deal with a class of learning problems that need a clear inter-pretation of the overall model (not only interpretability of the decisions made),but there is no prior knowledge to specify the relations within the model alongwith a vast number of input data. Thus, this thesis presents a method to createa semantic representation of sensor data and its interpretation using conceptualspaces, which facilitates the task of explaining concepts. This facilitation is per-formed in this work by applying approaches to generate linguistic descriptions


taskofconceptdescriptions.However,ininstance-basedlearning,thesimilar-ityfunctionsareusuallyappliedwithinasingle-domainfeaturespace[12].Acomparisonofthepracticalityandeffectivenessininstance-basedlearningandconceptualspaceswaspresentedin[140].However,inthiswork,theauthorsdidnotincludethemodelconstructionprocessintheirdiscussion.Inthisthesis,aninstance-basedapproachisproposedforconceptformationthatconsiderstheroleofthefeaturesinvolvedtoderiveamulti-domainspaceandrepresenttheconceptsinsuchaspace.

Usingdataminingapproachesintheprocessofderivingconceptualspaceshasbeenstudiedinafewisolatedworks.Keßler[131]outlinedtheideaofus-ingconceptualspacestodescribedata,withsomediscussionsonthepossibilityofautomaticallygeneratingsuchspacesfromdatabases.Lee[139]proposedadataminingmethodcoupledwithconceptualspaces,whichaddressescognitivetaskssuchasconceptformationusingclusteringtechniques.Themaindraw-backoftheseapproachesisthattheyrelyonknowingaboutthesemanticsofadomain(i.e.,anapplicationarea)beforehandinordertodirectlydeterminethedomainsandthequalitydimensionsofaconceptualspace.However,anes-sentialchallengeistoautomaticallyfindthemostrelatedfeaturesasintegratedqualitydimensions,withoutimportingextraknowledgetothemodel.Moreprecisely,thedata-drivensideofthisthesisreliesontheautomaticallyfindingthebestsubsetoffeaturesandthengroupingthemintothedomains,withoutexpertknowledge.

Theproposedapproachinthisthesisholdsforcertainclassesofproblems.Itexploresapplicationswhereintheinputdataiscomplicatedtobeinter-pretedatfirstglance.Withinsuchapplications,thetaskofspecifyingthein-terpretabledomainsanddimensionsbasedonhumanperceptionsisnotrivial.Theseclassesofproblemsusuallydealwithrawsensordata(sometimesmulti-variatedata)withlittleornopriorknowledgeabouttheirsemantics[188].Theprocessoflearningfortheseproblemsaretypicallyperformedbyconnectionistapproaches(i.e.,neuralnetworkarchitectures[219])assolutionforrepresent-ingtherelationofinstancesonaperceptuallevel[14].Butthemaindrawbackofsuchsolutionsisthatitneglectstheexplainabilityoftheinvolvedconceptsortheinterpretabilityofthelearnedmodel(i.e.,features)fromasemanticper-spective.

Thisthesisaimstoenabledomainformationofaconceptualspacethatishighlydata-driven.Themotivationistocreateasemanticmodelabletopreserveinductionandsemanticinference.Themotivationoftheproposedapproachistodealwithaclassoflearningproblemsthatneedaclearinter-pretationoftheoverallmodel(notonlyinterpretabilityofthedecisionsmade),butthereisnopriorknowledgetospecifytherelationswithinthemodelalongwithavastnumberofinputdata.Thus,thisthesispresentsamethodtocreateasemanticrepresentationofsensordataanditsinterpretationusingconceptualspaces,whichfacilitatesthetaskofexplainingconcepts.Thisfacilitationisper-formedinthisworkbyapplyingapproachestogeneratelinguisticdescriptions


task of concept descriptions. However, in instance-based learning, the similar-ity functions are usually applied within a single-domain feature space [12]. Acomparison of the practicality and effectiveness in instance-based learning andconceptual spaces was presented in [140]. However, in this work, the authorsdid not include the model construction process in their discussion. In this thesis,an instance-based approach is proposed for concept formation that considersthe role of the features involved to derive a multi-domain space and representthe concepts in such a space.

Using data mining approaches in the process of deriving conceptual spaceshas been studied in a few isolated works. Keßler [131] outlined the idea of us-ing conceptual spaces to describe data, with some discussions on the possibilityof automatically generating such spaces from databases. Lee [139] proposed adata mining method coupled with conceptual spaces, which addresses cognitivetasks such as concept formation using clustering techniques. The main draw-back of these approaches is that they rely on knowing about the semantics ofa domain (i.e., an application area) beforehand in order to directly determinethe domains and the quality dimensions of a conceptual space. However, an es-sential challenge is to automatically find the most related features as integratedquality dimensions, without importing extra knowledge to the model. Moreprecisely, the data-driven side of this thesis relies on the automatically findingthe best subset of features and then grouping them into the domains, withoutexpert knowledge.

The proposed approach in this thesis holds for certain classes of problems.It explores applications wherein the input data is complicated to be inter-preted at first glance. Within such applications, the task of specifying the in-terpretable domains and dimensions based on human perceptions is no trivial.These classes of problems usually deal with raw sensor data (sometimes multi-variate data) with little or no prior knowledge about their semantics [188]. Theprocess of learning for these problems are typically performed by connectionistapproaches (i.e., neural network architectures [219]) as solution for represent-ing the relation of instances on a perceptual level [14]. But the main drawbackof such solutions is that it neglects the explainability of the involved conceptsor the interpretability of the learned model (i.e., features) from a semantic per-spective.

This thesis aims to enable domain formation of a conceptual space thatis highly data-driven. The motivation is to create a semantic model able topreserve induction and semantic inference. The motivation of the proposedapproach is to deal with a class of learning problems that need a clear inter-pretation of the overall model (not only interpretability of the decisions made),but there is no prior knowledge to specify the relations within the model alongwith a vast number of input data. Thus, this thesis presents a method to createa semantic representation of sensor data and its interpretation using conceptualspaces, which facilitates the task of explaining concepts. This facilitation is per-formed in this work by applying approaches to generate linguistic descriptions


taskofconceptdescriptions.However,ininstance-basedlearning,thesimilar-ityfunctionsareusuallyappliedwithinasingle-domainfeaturespace[12].Acomparisonofthepracticalityandeffectivenessininstance-basedlearningandconceptualspaceswaspresentedin[140].However,inthiswork,theauthorsdidnotincludethemodelconstructionprocessintheirdiscussion.Inthisthesis,aninstance-basedapproachisproposedforconceptformationthatconsiderstheroleofthefeaturesinvolvedtoderiveamulti-domainspaceandrepresenttheconceptsinsuchaspace.

Usingdataminingapproachesintheprocessofderivingconceptualspaceshasbeenstudiedinafewisolatedworks.Keßler[131]outlinedtheideaofus-ingconceptualspacestodescribedata,withsomediscussionsonthepossibilityofautomaticallygeneratingsuchspacesfromdatabases.Lee[139]proposedadataminingmethodcoupledwithconceptualspaces,whichaddressescognitivetaskssuchasconceptformationusingclusteringtechniques.Themaindraw-backoftheseapproachesisthattheyrelyonknowingaboutthesemanticsofadomain(i.e.,anapplicationarea)beforehandinordertodirectlydeterminethedomainsandthequalitydimensionsofaconceptualspace.However,anes-sentialchallengeistoautomaticallyfindthemostrelatedfeaturesasintegratedqualitydimensions,withoutimportingextraknowledgetothemodel.Moreprecisely,thedata-drivensideofthisthesisreliesontheautomaticallyfindingthebestsubsetoffeaturesandthengroupingthemintothedomains,withoutexpertknowledge.

Theproposedapproachinthisthesisholdsforcertainclassesofproblems.Itexploresapplicationswhereintheinputdataiscomplicatedtobeinter-pretedatfirstglance.Withinsuchapplications,thetaskofspecifyingthein-terpretabledomainsanddimensionsbasedonhumanperceptionsisnotrivial.Theseclassesofproblemsusuallydealwithrawsensordata(sometimesmulti-variatedata)withlittleornopriorknowledgeabouttheirsemantics[188].Theprocessoflearningfortheseproblemsaretypicallyperformedbyconnectionistapproaches(i.e.,neuralnetworkarchitectures[219])assolutionforrepresent-ingtherelationofinstancesonaperceptuallevel[14].Butthemaindrawbackofsuchsolutionsisthatitneglectstheexplainabilityoftheinvolvedconceptsortheinterpretabilityofthelearnedmodel(i.e.,features)fromasemanticper-spective.

Thisthesisaimstoenabledomainformationofaconceptualspacethatishighlydata-driven.Themotivationistocreateasemanticmodelabletopreserveinductionandsemanticinference.Themotivationoftheproposedapproachistodealwithaclassoflearningproblemsthatneedaclearinter-pretationoftheoverallmodel(notonlyinterpretabilityofthedecisionsmade),butthereisnopriorknowledgetospecifytherelationswithinthemodelalongwithavastnumberofinputdata.Thus,thisthesispresentsamethodtocreateasemanticrepresentationofsensordataanditsinterpretationusingconceptualspaces,whichfacilitatesthetaskofexplainingconcepts.Thisfacilitationisper-formedinthisworkbyapplyingapproachestogeneratelinguisticdescriptions

2.3. GENERATING LINGUISTIC DESCRIPTIONS 27

for unknown instances of concepts, using the constructed semantic represen-tations as conceptual spaces. For this reason, the next section will provide anoverview of linguistic description and natural language generation approachesthat have been addressed in the literature.

2.3 Generating Linguistic Descriptions

In general, the task of generating linguistic descriptions is the process of gen-erating understandable information in the form of natural linguistic expres-sions for the target user. This task is addressed in the literature in two researchfields: linguistic descriptions of data (LDD), and natural language generation(NLG) [178]. LDD has its roots in fuzzy set theory and deals with summarisingperception-based attributes of numeric data set using linguistic characterisa-tions defined by fuzzy sets that are able to deal with the imprecision of humanlanguage. NLG, on the other hand, aims to automatically generate natural lan-guage text in the form of sentences that are as close as possible to human cre-ated text. Both of LDD and NLG fields contain elements which are relevant tothis thesis. Herein, the definitions and the usage of these fields are presented.

2.3.1 Linguistic Descriptions of Data (LDD)

The field of linguistic descriptions of data has emerged from the use of fuzzyset theory and soft computing to perform linguistic computations on data. Thistask studies the necessity of automatically describing numeric data sets by em-ploying a set of linguistic terms. Fuzzy set theory is a well-studied approach tobridge between numeric and linguistic information, specifically in perception-based systems [35, 126]. The basic idea of linguistic descriptions comes fromthe works of Zadeh [239] and Yager [233] on developing the paradigms ofcomputing with words [239], and later, the computational theory of percep-tion [238, 241], which express the ability of computing systems in a linguisticmanner [73]. Within these paradigms, developed approaches are often based onfuzzy quantification models [70] to generate simple linguistic summaries on thevariables, such as “most of the apples are red”. Although there is no high-levelabstraction to present a generic model of linguistic descriptions, Ramos-Soto etal., [178] have listed the essential elements of linguistic description approachesas follows:

• Input data, usually including numerical and sequential data, which rep-resents temporal or spatial properties. Temperature, height, weight, andsize are some examples of input data. The input data is also called nu-meric variables or numeric properties of a domain.

• Linguistic variables, defining the fuzzy sets on the provided input data tocategorise and annotate the input variable. Linguistic variables are tightly

2.3.GENERATINGLINGUISTICDESCRIPTIONS27

forunknowninstancesofconcepts,usingtheconstructedsemanticrepresen-tationsasconceptualspaces.Forthisreason,thenextsectionwillprovideanoverviewoflinguisticdescriptionandnaturallanguagegenerationapproachesthathavebeenaddressedintheliterature.

2.3GeneratingLinguisticDescriptions

Ingeneral,thetaskofgeneratinglinguisticdescriptionsistheprocessofgen-eratingunderstandableinformationintheformofnaturallinguisticexpres-sionsforthetargetuser.Thistaskisaddressedintheliteratureintworesearchfields:linguisticdescriptionsofdata(LDD),andnaturallanguagegeneration(NLG)[178].LDDhasitsrootsinfuzzysettheoryanddealswithsummarisingperception-basedattributesofnumericdatasetusinglinguisticcharacterisa-tionsdefinedbyfuzzysetsthatareabletodealwiththeimprecisionofhumanlanguage.NLG,ontheotherhand,aimstoautomaticallygeneratenaturallan-guagetextintheformofsentencesthatareascloseaspossibletohumancre-atedtext.BothofLDDandNLGfieldscontainelementswhicharerelevanttothisthesis.Herein,thedefinitionsandtheusageofthesefieldsarepresented.

2.3.1LinguisticDescriptionsofData(LDD)

Thefieldoflinguisticdescriptionsofdatahasemergedfromtheuseoffuzzysettheoryandsoftcomputingtoperformlinguisticcomputationsondata.Thistaskstudiesthenecessityofautomaticallydescribingnumericdatasetsbyem-ployingasetoflinguisticterms.Fuzzysettheoryisawell-studiedapproachtobridgebetweennumericandlinguisticinformation,specificallyinperception-basedsystems[35,126].ThebasicideaoflinguisticdescriptionscomesfromtheworksofZadeh[239]andYager[233]ondevelopingtheparadigmsofcomputingwithwords[239],andlater,thecomputationaltheoryofpercep-tion[238,241],whichexpresstheabilityofcomputingsystemsinalinguisticmanner[73].Withintheseparadigms,developedapproachesareoftenbasedonfuzzyquantificationmodels[70]togeneratesimplelinguisticsummariesonthevariables,suchas“mostoftheapplesarered”.Althoughthereisnohigh-levelabstractiontopresentagenericmodeloflinguisticdescriptions,Ramos-Sotoetal.,[178]havelistedtheessentialelementsoflinguisticdescriptionapproachesasfollows:

•Inputdata,usuallyincludingnumericalandsequentialdata,whichrep-resentstemporalorspatialproperties.Temperature,height,weight,andsizearesomeexamplesofinputdata.Theinputdataisalsocallednu-mericvariablesornumericpropertiesofadomain.

•Linguisticvariables,definingthefuzzysetsontheprovidedinputdatatocategoriseandannotatetheinputvariable.Linguisticvariablesaretightly


for unknown instances of concepts, using the constructed semantic represen-tations as conceptual spaces. For this reason, the next section will provide anoverview of linguistic description and natural language generation approachesthat have been addressed in the literature.

2.3 Generating Linguistic Descriptions

In general, the task of generating linguistic descriptions is the process of gen-erating understandable information in the form of natural linguistic expres-sions for the target user. This task is addressed in the literature in two researchfields: linguistic descriptions of data (LDD), and natural language generation(NLG) [178]. LDD has its roots in fuzzy set theory and deals with summarisingperception-based attributes of numeric data set using linguistic characterisa-tions defined by fuzzy sets that are able to deal with the imprecision of humanlanguage. NLG, on the other hand, aims to automatically generate natural lan-guage text in the form of sentences that are as close as possible to human cre-ated text. Both of LDD and NLG fields contain elements which are relevant tothis thesis. Herein, the definitions and the usage of these fields are presented.

2.3.1 Linguistic Descriptions of Data (LDD)

The field of linguistic descriptions of data has emerged from the use of fuzzyset theory and soft computing to perform linguistic computations on data. Thistask studies the necessity of automatically describing numeric data sets by em-ploying a set of linguistic terms. Fuzzy set theory is a well-studied approach tobridge between numeric and linguistic information, specifically in perception-based systems [35, 126]. The basic idea of linguistic descriptions comes fromthe works of Zadeh [239] and Yager [233] on developing the paradigms ofcomputing with words [239], and later, the computational theory of percep-tion [238, 241], which express the ability of computing systems in a linguisticmanner [73]. Within these paradigms, developed approaches are often based onfuzzy quantification models [70] to generate simple linguistic summaries on thevariables, such as “most of the apples are red”. Although there is no high-levelabstraction to present a generic model of linguistic descriptions, Ramos-Soto etal., [178] have listed the essential elements of linguistic description approachesas follows:

• Input data, usually including numerical and sequential data, which rep-resents temporal or spatial properties. Temperature, height, weight, andsize are some examples of input data. The input data is also called nu-meric variables or numeric properties of a domain.

• Linguistic variables, defining the fuzzy sets on the provided input data tocategorise and annotate the input variable. Linguistic variables are tightly


forunknowninstancesofconcepts,usingtheconstructedsemanticrepresen-tationsasconceptualspaces.Forthisreason,thenextsectionwillprovideanoverviewoflinguisticdescriptionandnaturallanguagegenerationapproachesthathavebeenaddressedintheliterature.

2.3GeneratingLinguisticDescriptions

Ingeneral,thetaskofgeneratinglinguisticdescriptionsistheprocessofgen-eratingunderstandableinformationintheformofnaturallinguisticexpres-sionsforthetargetuser.Thistaskisaddressedintheliteratureintworesearchfields:linguisticdescriptionsofdata(LDD),andnaturallanguagegeneration(NLG)[178].LDDhasitsrootsinfuzzysettheoryanddealswithsummarisingperception-basedattributesofnumericdatasetusinglinguisticcharacterisa-tionsdefinedbyfuzzysetsthatareabletodealwiththeimprecisionofhumanlanguage.NLG,ontheotherhand,aimstoautomaticallygeneratenaturallan-guagetextintheformofsentencesthatareascloseaspossibletohumancre-atedtext.BothofLDDandNLGfieldscontainelementswhicharerelevanttothisthesis.Herein,thedefinitionsandtheusageofthesefieldsarepresented.

2.3.1LinguisticDescriptionsofData(LDD)

Thefieldoflinguisticdescriptionsofdatahasemergedfromtheuseoffuzzysettheoryandsoftcomputingtoperformlinguisticcomputationsondata.Thistaskstudiesthenecessityofautomaticallydescribingnumericdatasetsbyem-ployingasetoflinguisticterms.Fuzzysettheoryisawell-studiedapproachtobridgebetweennumericandlinguisticinformation,specificallyinperception-basedsystems[35,126].ThebasicideaoflinguisticdescriptionscomesfromtheworksofZadeh[239]andYager[233]ondevelopingtheparadigmsofcomputingwithwords[239],andlater,thecomputationaltheoryofpercep-tion[238,241],whichexpresstheabilityofcomputingsystemsinalinguisticmanner[73].Withintheseparadigms,developedapproachesareoftenbasedonfuzzyquantificationmodels[70]togeneratesimplelinguisticsummariesonthevariables,suchas“mostoftheapplesarered”.Althoughthereisnohigh-levelabstractiontopresentagenericmodeloflinguisticdescriptions,Ramos-Sotoetal.,[178]havelistedtheessentialelementsoflinguisticdescriptionapproachesasfollows:

•Inputdata,usuallyincludingnumericalandsequentialdata,whichrep-resentstemporalorspatialproperties.Temperature,height,weight,andsizearesomeexamplesofinputdata.Theinputdataisalsocallednu-mericvariablesornumericpropertiesofadomain.

•Linguisticvariables,definingthefuzzysetsontheprovidedinputdatatocategoriseandannotatetheinputvariable.Linguisticvariablesaretightly


related to the definition of fuzzy granulation of input variables. An exam-ple of such fuzzy granulation to map the numerical values of an inputvariable like size is a set of linguistic labels (i.e., words) such as small,medium, or large. These linguistic variables are characterised by fuzzyintervals with non-determined boundaries using fuzzy membership func-tions. (More details will be presented in Section 4.2.2).

• Fuzzy quantifiers, are the fuzzy granulations which lead to provide propo-sitions like low, increasing, and significant for the input variables [240].These quantifiers are also characterised by fuzzy membership functions.

• Evaluation criteria, defined to assess the appropriation of the generateddescriptions based on several criteria such as data coverage degree, fulfil-ment degree, relevance and the length of the descriptions.

The algorithms that are employed in linguistic descriptions of data are in-fluenced by fuzzy techniques to construct quantified sentences. LDD methodsproduce all the possible combinations of the sentences using the provided quan-tifiers and linguistic variables for a set of input data. Depending on the com-plexity of the domain, type-I or type-II of quantified sentences (which are thestatements with simple or complex relations between quantifiers and variables,respectively [44]) may be employed [70, 163]. Afterwards, the generated sen-tences are ranked, added or removed from the output text based on the definedcriteria. The generated linguistic sentences are simple from the natural languagepoint of view with usually template-based messages to handle the sentences.

The theoretical works on developing linguistic descriptions started with thetopic of fuzzy quantification as the task of computing with words [239]. Thework of Kacprzyk [125, 126] introduced a way to relate the computation ofwords in fuzzy logic to an implementable linguistic summarisation of data.According to [126], a linguistic summarisation of a data set consists of a sum-mariser S (e.g., young), a quantity agreement Q (e.g., most), and a truth degreeT . Then, an abstract prototype of a linguistic summary can be in the form“Q Y’s are S”, where Y is a set of observed objects. As an example, “mostof the employees are young” is the result of a linguistic summarisation us-ing fuzzy logic [126]. Practically, linguistic descriptions are applied in severalapplications. The granular linguistic model of phenomena (GLMP) [224] is ageneral framework that works based on applying fuzzy rules on a set of com-putational perceptions as inputs. This solution is employed for a verity of ap-plications [178]. In [49], the concept of linguistic summarisation is employedto fulfil the precision and brevity on data related to the patient inflow, wherethe descriptions are for example: “most of the days of June, patient inflowis medium”. The work in [179] presents the use of LDD in meteorology togenerate monthly reports emphasising related contrast descriptions, e.g.: “Thetemperature was high for October, ... with very cold temperatures during thefourth week”.


relatedtothedefinitionoffuzzygranulationofinputvariables.Anexam-pleofsuchfuzzygranulationtomapthenumericalvaluesofaninputvariablelikesizeisasetoflinguisticlabels(i.e.,words)suchassmall,medium,orlarge.Theselinguisticvariablesarecharacterisedbyfuzzyintervalswithnon-determinedboundariesusingfuzzymembershipfunc-tions.(MoredetailswillbepresentedinSection4.2.2).

•Fuzzyquantifiers,arethefuzzygranulationswhichleadtoprovidepropo-sitionslikelow,increasing,andsignificantfortheinputvariables[240].Thesequantifiersarealsocharacterisedbyfuzzymembershipfunctions.

•Evaluationcriteria,definedtoassesstheappropriationofthegenerateddescriptionsbasedonseveralcriteriasuchasdatacoveragedegree,fulfil-mentdegree,relevanceandthelengthofthedescriptions.

Thealgorithmsthatareemployedinlinguisticdescriptionsofdataarein-fluencedbyfuzzytechniquestoconstructquantifiedsentences.LDDmethodsproduceallthepossiblecombinationsofthesentencesusingtheprovidedquan-tifiersandlinguisticvariablesforasetofinputdata.Dependingonthecom-plexityofthedomain,type-Iortype-IIofquantifiedsentences(whicharethestatementswithsimpleorcomplexrelationsbetweenquantifiersandvariables,respectively[44])maybeemployed[70,163].Afterwards,thegeneratedsen-tencesareranked,addedorremovedfromtheoutputtextbasedonthedefinedcriteria.Thegeneratedlinguisticsentencesaresimplefromthenaturallanguagepointofviewwithusuallytemplate-basedmessagestohandlethesentences.

Thetheoreticalworksondevelopinglinguisticdescriptionsstartedwiththetopicoffuzzyquantificationasthetaskofcomputingwithwords[239].TheworkofKacprzyk[125,126]introducedawaytorelatethecomputationofwordsinfuzzylogictoanimplementablelinguisticsummarisationofdata.Accordingto[126],alinguisticsummarisationofadatasetconsistsofasum-mariserS(e.g.,young),aquantityagreementQ(e.g.,most),andatruthdegreeT.Then,anabstractprototypeofalinguisticsummarycanbeintheform“QY’sareS”,whereYisasetofobservedobjects.Asanexample,“mostoftheemployeesareyoung”istheresultofalinguisticsummarisationus-ingfuzzylogic[126].Practically,linguisticdescriptionsareappliedinseveralapplications.Thegranularlinguisticmodelofphenomena(GLMP)[224]isageneralframeworkthatworksbasedonapplyingfuzzyrulesonasetofcom-putationalperceptionsasinputs.Thissolutionisemployedforaverityofap-plications[178].In[49],theconceptoflinguisticsummarisationisemployedtofulfiltheprecisionandbrevityondatarelatedtothepatientinflow,wherethedescriptionsareforexample:“mostofthedaysofJune,patientinflowismedium”.Theworkin[179]presentstheuseofLDDinmeteorologytogeneratemonthlyreportsemphasisingrelatedcontrastdescriptions,e.g.:“ThetemperaturewashighforOctober,...withverycoldtemperaturesduringthefourthweek”.


related to the definition of fuzzy granulation of input variables. An exam-ple of such fuzzy granulation to map the numerical values of an inputvariable like size is a set of linguistic labels (i.e., words) such as small,medium, or large. These linguistic variables are characterised by fuzzyintervals with non-determined boundaries using fuzzy membership func-tions. (More details will be presented in Section 4.2.2).

• Fuzzy quantifiers, are the fuzzy granulations which lead to provide propo-sitions like low, increasing, and significant for the input variables [240].These quantifiers are also characterised by fuzzy membership functions.

• Evaluation criteria, defined to assess the appropriation of the generateddescriptions based on several criteria such as data coverage degree, fulfil-ment degree, relevance and the length of the descriptions.

The algorithms that are employed in linguistic descriptions of data are in-fluenced by fuzzy techniques to construct quantified sentences. LDD methodsproduce all the possible combinations of the sentences using the provided quan-tifiers and linguistic variables for a set of input data. Depending on the com-plexity of the domain, type-I or type-II of quantified sentences (which are thestatements with simple or complex relations between quantifiers and variables,respectively [44]) may be employed [70, 163]. Afterwards, the generated sen-tences are ranked, added or removed from the output text based on the definedcriteria. The generated linguistic sentences are simple from the natural languagepoint of view with usually template-based messages to handle the sentences.

The theoretical works on developing linguistic descriptions started with thetopic of fuzzy quantification as the task of computing with words [239]. Thework of Kacprzyk [125, 126] introduced a way to relate the computation ofwords in fuzzy logic to an implementable linguistic summarisation of data.According to [126], a linguistic summarisation of a data set consists of a sum-mariser S (e.g., young), a quantity agreement Q (e.g., most), and a truth degreeT . Then, an abstract prototype of a linguistic summary can be in the form“Q Y’s are S”, where Y is a set of observed objects. As an example, “mostof the employees are young” is the result of a linguistic summarisation us-ing fuzzy logic [126]. Practically, linguistic descriptions are applied in severalapplications. The granular linguistic model of phenomena (GLMP) [224] is ageneral framework that works based on applying fuzzy rules on a set of com-putational perceptions as inputs. This solution is employed for a verity of ap-plications [178]. In [49], the concept of linguistic summarisation is employedto fulfil the precision and brevity on data related to the patient inflow, wherethe descriptions are for example: “most of the days of June, patient inflowis medium”. The work in [179] presents the use of LDD in meteorology togenerate monthly reports emphasising related contrast descriptions, e.g.: “Thetemperature was high for October, ... with very cold temperatures during thefourth week”.


relatedtothedefinitionoffuzzygranulationofinputvariables.Anexam-pleofsuchfuzzygranulationtomapthenumericalvaluesofaninputvariablelikesizeisasetoflinguisticlabels(i.e.,words)suchassmall,medium,orlarge.Theselinguisticvariablesarecharacterisedbyfuzzyintervalswithnon-determinedboundariesusingfuzzymembershipfunc-tions.(MoredetailswillbepresentedinSection4.2.2).

•Fuzzyquantifiers,arethefuzzygranulationswhichleadtoprovidepropo-sitionslikelow,increasing,andsignificantfortheinputvariables[240].Thesequantifiersarealsocharacterisedbyfuzzymembershipfunctions.

•Evaluationcriteria,definedtoassesstheappropriationofthegenerateddescriptionsbasedonseveralcriteriasuchasdatacoveragedegree,fulfil-mentdegree,relevanceandthelengthofthedescriptions.

Thealgorithmsthatareemployedinlinguisticdescriptionsofdataarein-fluencedbyfuzzytechniquestoconstructquantifiedsentences.LDDmethodsproduceallthepossiblecombinationsofthesentencesusingtheprovidedquan-tifiersandlinguisticvariablesforasetofinputdata.Dependingonthecom-plexityofthedomain,type-Iortype-IIofquantifiedsentences(whicharethestatementswithsimpleorcomplexrelationsbetweenquantifiersandvariables,respectively[44])maybeemployed[70,163].Afterwards,thegeneratedsen-tencesareranked,addedorremovedfromtheoutputtextbasedonthedefinedcriteria.Thegeneratedlinguisticsentencesaresimplefromthenaturallanguagepointofviewwithusuallytemplate-basedmessagestohandlethesentences.

Thetheoreticalworksondevelopinglinguisticdescriptionsstartedwiththetopicoffuzzyquantificationasthetaskofcomputingwithwords[239].TheworkofKacprzyk[125,126]introducedawaytorelatethecomputationofwordsinfuzzylogictoanimplementablelinguisticsummarisationofdata.Accordingto[126],alinguisticsummarisationofadatasetconsistsofasum-mariserS(e.g.,young),aquantityagreementQ(e.g.,most),andatruthdegreeT.Then,anabstractprototypeofalinguisticsummarycanbeintheform“QY’sareS”,whereYisasetofobservedobjects.Asanexample,“mostoftheemployeesareyoung”istheresultofalinguisticsummarisationus-ingfuzzylogic[126].Practically,linguisticdescriptionsareappliedinseveralapplications.Thegranularlinguisticmodelofphenomena(GLMP)[224]isageneralframeworkthatworksbasedonapplyingfuzzyrulesonasetofcom-putationalperceptionsasinputs.Thissolutionisemployedforaverityofap-plications[178].In[49],theconceptoflinguisticsummarisationisemployedtofulfiltheprecisionandbrevityondatarelatedtothepatientinflow,wherethedescriptionsareforexample:“mostofthedaysofJune,patientinflowismedium”.Theworkin[179]presentstheuseofLDDinmeteorologytogeneratemonthlyreportsemphasisingrelatedcontrastdescriptions,e.g.:“ThetemperaturewashighforOctober,...withverycoldtemperaturesduringthefourthweek”.


2.3.2 Natural Language Generation (NLG)

Studies on generating linguistic descriptions also encompass the field of naturallanguage generation (NLG). An NLG system aims to generate human-readablenatural language from non-linguistic information [185]. Typically, NLG sys-tems generate text based on acquired knowledge about both language and theapplication domain. Based on the requirements of the system, NLG solutionsfollow two different goals: 1) Automatically generating useful text from non-textual inputs to comply with a specific set of needs, and/or 2) Automaticallyproducing human-like text from non-textual inputs, to simulate an alreadyknown corpus of human-written text.

While a variety of NLG architectures and implementations are proposed inthe literature, a generic architecture for an NLG system has been formulatedand presented by Reiter and Dale [184,185] based on the fact that the primarytask of a natural language generation system is to convert acquired knowledgefrom underlying non-linguistic data into an understandable set of messages asoutput text. The proposed architecture consists of three main modules: docu-ment planning, microplanning, and realisation. Each of these modules performsa set of tasks related to the goal of the system. Table 2.1 depicts the modulesand their corresponding tasks in a typical NLG system. The main content tasksof an NLG system are shortly described as follows. Note that in this thesis,the use of NLG systems is mostly limited to the task of content determination,where its related work will be further explained.

• Content determination is the task to decide what information will ap-pear in the output text, based on the provided input information. Contentdetermination is the most relevant aspect of NLG regarding the linguis-tic characterisation of numerical data [187, 237]. This task determineswhether or not an acquired set of numerical information is able to berepresented linguistically and be semantically labelled to appear in thefinal text. Thus, the content determination is the core of bridging fromnumerical data to the semantic representations in NLG systems. A recentsurvey on the task of content determination and content selection can befound in [103].

Table 2.1: Typical modules and tasks of an NLG system [185].

Modules Content task Structure task

Document Planning Content determination Document structuringMicroplanning

Lixicalisation; Referringexpression generation

Aggregation

Realisation Linguistic realisation Structure realisation


2.3.2NaturalLanguageGeneration(NLG)

Studiesongeneratinglinguisticdescriptionsalsoencompassthefieldofnaturallanguagegeneration(NLG).AnNLGsystemaimstogeneratehuman-readablenaturallanguagefromnon-linguisticinformation[185].Typically,NLGsys-temsgeneratetextbasedonacquiredknowledgeaboutbothlanguageandtheapplicationdomain.Basedontherequirementsofthesystem,NLGsolutionsfollowtwodifferentgoals:1)Automaticallygeneratingusefultextfromnon-textualinputstocomplywithaspecificsetofneeds,and/or2)Automaticallyproducinghuman-liketextfromnon-textualinputs,tosimulateanalreadyknowncorpusofhuman-writtentext.

WhileavarietyofNLGarchitecturesandimplementationsareproposedintheliterature,agenericarchitectureforanNLGsystemhasbeenformulatedandpresentedbyReiterandDale[184,185]basedonthefactthattheprimarytaskofanaturallanguagegenerationsystemistoconvertacquiredknowledgefromunderlyingnon-linguisticdataintoanunderstandablesetofmessagesasoutputtext.Theproposedarchitectureconsistsofthreemainmodules:docu-mentplanning,microplanning,andrealisation.Eachofthesemodulesperformsasetoftasksrelatedtothegoalofthesystem.Table2.1depictsthemodulesandtheircorrespondingtasksinatypicalNLGsystem.ThemaincontenttasksofanNLGsystemareshortlydescribedasfollows.Notethatinthisthesis,theuseofNLGsystemsismostlylimitedtothetaskofcontentdetermination,whereitsrelatedworkwillbefurtherexplained.

•Contentdeterminationisthetasktodecidewhatinformationwillap-pearintheoutputtext,basedontheprovidedinputinformation.ContentdeterminationisthemostrelevantaspectofNLGregardingthelinguis-ticcharacterisationofnumericaldata[187,237].Thistaskdetermineswhetherornotanacquiredsetofnumericalinformationisabletoberepresentedlinguisticallyandbesemanticallylabelledtoappearinthefinaltext.Thus,thecontentdeterminationisthecoreofbridgingfromnumericaldatatothesemanticrepresentationsinNLGsystems.Arecentsurveyonthetaskofcontentdeterminationandcontentselectioncanbefoundin[103].

Table2.1:TypicalmodulesandtasksofanNLGsystem[185].

ModulesContenttaskStructuretask

DocumentPlanningContentdeterminationDocumentstructuringMicroplanning

Lixicalisation;Referringexpressiongeneration

Aggregation

RealisationLinguisticrealisationStructurerealisation


2.3.2 Natural Language Generation (NLG)

Studies on generating linguistic descriptions also encompass the field of naturallanguage generation (NLG). An NLG system aims to generate human-readablenatural language from non-linguistic information [185]. Typically, NLG sys-tems generate text based on acquired knowledge about both language and theapplication domain. Based on the requirements of the system, NLG solutionsfollow two different goals: 1) Automatically generating useful text from non-textual inputs to comply with a specific set of needs, and/or 2) Automaticallyproducing human-like text from non-textual inputs, to simulate an alreadyknown corpus of human-written text.

While a variety of NLG architectures and implementations are proposed inthe literature, a generic architecture for an NLG system has been formulatedand presented by Reiter and Dale [184,185] based on the fact that the primarytask of a natural language generation system is to convert acquired knowledgefrom underlying non-linguistic data into an understandable set of messages asoutput text. The proposed architecture consists of three main modules: docu-ment planning, microplanning, and realisation. Each of these modules performsa set of tasks related to the goal of the system. Table 2.1 depicts the modulesand their corresponding tasks in a typical NLG system. The main content tasksof an NLG system are shortly described as follows. Note that in this thesis,the use of NLG systems is mostly limited to the task of content determination,where its related work will be further explained.

• Content determination is the task to decide what information will ap-pear in the output text, based on the provided input information. Contentdetermination is the most relevant aspect of NLG regarding the linguis-tic characterisation of numerical data [187, 237]. This task determineswhether or not an acquired set of numerical information is able to berepresented linguistically and be semantically labelled to appear in thefinal text. Thus, the content determination is the core of bridging fromnumerical data to the semantic representations in NLG systems. A recentsurvey on the task of content determination and content selection can befound in [103].

Table 2.1: Typical modules and tasks of an NLG system [185].

Modules Content task Structure task

Document Planning Content determination Document structuringMicroplanning

Lixicalisation; Referringexpression generation

Aggregation

Realisation Linguistic realisation Structure realisation


2.3.2NaturalLanguageGeneration(NLG)

Studiesongeneratinglinguisticdescriptionsalsoencompassthefieldofnaturallanguagegeneration(NLG).AnNLGsystemaimstogeneratehuman-readablenaturallanguagefromnon-linguisticinformation[185].Typically,NLGsys-temsgeneratetextbasedonacquiredknowledgeaboutbothlanguageandtheapplicationdomain.Basedontherequirementsofthesystem,NLGsolutionsfollowtwodifferentgoals:1)Automaticallygeneratingusefultextfromnon-textualinputstocomplywithaspecificsetofneeds,and/or2)Automaticallyproducinghuman-liketextfromnon-textualinputs,tosimulateanalreadyknowncorpusofhuman-writtentext.

WhileavarietyofNLGarchitecturesandimplementationsareproposedintheliterature,agenericarchitectureforanNLGsystemhasbeenformulatedandpresentedbyReiterandDale[184,185]basedonthefactthattheprimarytaskofanaturallanguagegenerationsystemistoconvertacquiredknowledgefromunderlyingnon-linguisticdataintoanunderstandablesetofmessagesasoutputtext.Theproposedarchitectureconsistsofthreemainmodules:docu-mentplanning,microplanning,andrealisation.Eachofthesemodulesperformsasetoftasksrelatedtothegoalofthesystem.Table2.1depictsthemodulesandtheircorrespondingtasksinatypicalNLGsystem.ThemaincontenttasksofanNLGsystemareshortlydescribedasfollows.Notethatinthisthesis,theuseofNLGsystemsismostlylimitedtothetaskofcontentdetermination,whereitsrelatedworkwillbefurtherexplained.

•Contentdeterminationisthetasktodecidewhatinformationwillap-pearintheoutputtext,basedontheprovidedinputinformation.ContentdeterminationisthemostrelevantaspectofNLGregardingthelinguis-ticcharacterisationofnumericaldata[187,237].Thistaskdetermineswhetherornotanacquiredsetofnumericalinformationisabletoberepresentedlinguisticallyandbesemanticallylabelledtoappearinthefinaltext.Thus,thecontentdeterminationisthecoreofbridgingfromnumericaldatatothesemanticrepresentationsinNLGsystems.Arecentsurveyonthetaskofcontentdeterminationandcontentselectioncanbefoundin[103].

Table2.1:TypicalmodulesandtasksofanNLGsystem[185].

ModulesContenttaskStructuretask

DocumentPlanningContentdeterminationDocumentstructuringMicroplanning

Lixicalisation;Referringexpressiongeneration

Aggregation

RealisationLinguisticrealisationStructurerealisation


• Lexicalisation is the task to decide what specific words should be selectedto express the determined content. For example, the actual nouns, verbs,adjectives and adverbs to appear in the text are chosen from a lexicon.Lexicalisation determines the particular linguistic terms to be used to ex-plain the domain concepts and their relations. In an NLG system, the roleof lexicalisation is essential since the mapping between extracted numer-ical information to the predefined lexicon is not a trivial task.

• Referring expression generation is the task to decide what expressionsshould be used to refer to domain concepts and entities, in a way thatthe reader of the system recognises what the message refers to. So, refer-ring expression generation is about selecting proper names and reason-able pronouns, or providing a sufficient set of descriptions for an entityor object in order to distinguish that entity from the rest.

• Linguistic realisation is the task to apply grammatical rules of the targetlanguage considering both morphology (the study of word forms) andsyntax (the study of sentence structure). This task converts the providedabstract symbolic representations of the messages into the actual finalsentences in natural language.

An important class of NLG frameworks is data-to-text systems, wherein alinguistic summarisation of numeric data is produced with the help of data min-ing and AI algorithms. The main architecture of data-to-text systems has beenintroduced by Reiter [182] that includes the following stages: signal analysis,data interpretation, document planning, microplanning and realisation (Fig-ure 2.2). These systems identify and abstract the patterns of the numeric data,determine the most useful and relevant information, and generate a naturallanguage text for the acquired knowledge in an understandable form [119].

A complete survey on reviewing NLG tasks, architectures, and applica-tions has recently been provided by Gatt and Krahmer [93]. According to[93], the developed NLG systems can be divided into three approaches: rule-based (modular), planning-based, and data-driven approaches. Rule-based ap-proaches consider a crisp division among the NLG tasks. These methods usu-ally follow a set of pre-defined rules in order to perform each of sub-taskswithin the NLG architecture. Planning approaches aim to look at the goal ofgenerating text from data as the process of determining a sequence of actions.These approaches combine slightly the sub-tasks of NLG to generate text. Data-driven approaches are the new dominant trend in NLG, which consider thegoal of text generation as an integrated task. Data-driven approaches performthis goal usually by applying statistical learning on the alignment of the inputnon-linguistic information and the output texts. This automatic learning of thecorrespondences between data and text make these approaches data-driven.Several pieces of research have been done from this perspective, especially byfocusing on the neural network and deep learning methods (a wide review can


•Lexicalisationisthetasktodecidewhatspecificwordsshouldbeselectedtoexpressthedeterminedcontent.Forexample,theactualnouns,verbs,adjectivesandadverbstoappearinthetextarechosenfromalexicon.Lexicalisationdeterminestheparticularlinguistictermstobeusedtoex-plainthedomainconceptsandtheirrelations.InanNLGsystem,theroleoflexicalisationisessentialsincethemappingbetweenextractednumer-icalinformationtothepredefinedlexiconisnotatrivialtask.

•Referringexpressiongenerationisthetasktodecidewhatexpressionsshouldbeusedtorefertodomainconceptsandentities,inawaythatthereaderofthesystemrecogniseswhatthemessagerefersto.So,refer-ringexpressiongenerationisaboutselectingpropernamesandreason-ablepronouns,orprovidingasufficientsetofdescriptionsforanentityorobjectinordertodistinguishthatentityfromtherest.

•Linguisticrealisationisthetasktoapplygrammaticalrulesofthetargetlanguageconsideringbothmorphology(thestudyofwordforms)andsyntax(thestudyofsentencestructure).Thistaskconvertstheprovidedabstractsymbolicrepresentationsofthemessagesintotheactualfinalsentencesinnaturallanguage.

AnimportantclassofNLGframeworksisdata-to-textsystems,whereinalinguisticsummarisationofnumericdataisproducedwiththehelpofdatamin-ingandAIalgorithms.Themainarchitectureofdata-to-textsystemshasbeenintroducedbyReiter[182]thatincludesthefollowingstages:signalanalysis,datainterpretation,documentplanning,microplanningandrealisation(Fig-ure2.2).Thesesystemsidentifyandabstractthepatternsofthenumericdata,determinethemostusefulandrelevantinformation,andgenerateanaturallanguagetextfortheacquiredknowledgeinanunderstandableform[119].

AcompletesurveyonreviewingNLGtasks,architectures,andapplica-tionshasrecentlybeenprovidedbyGattandKrahmer[93].Accordingto[93],thedevelopedNLGsystemscanbedividedintothreeapproaches:rule-based(modular),planning-based,anddata-drivenapproaches.Rule-basedap-proachesconsideracrispdivisionamongtheNLGtasks.Thesemethodsusu-allyfollowasetofpre-definedrulesinordertoperformeachofsub-taskswithintheNLGarchitecture.Planningapproachesaimtolookatthegoalofgeneratingtextfromdataastheprocessofdeterminingasequenceofactions.Theseapproachescombineslightlythesub-tasksofNLGtogeneratetext.Data-drivenapproachesarethenewdominanttrendinNLG,whichconsiderthegoaloftextgenerationasanintegratedtask.Data-drivenapproachesperformthisgoalusuallybyapplyingstatisticallearningonthealignmentoftheinputnon-linguisticinformationandtheoutputtexts.Thisautomaticlearningofthecorrespondencesbetweendataandtextmaketheseapproachesdata-driven.Severalpiecesofresearchhavebeendonefromthisperspective,especiallybyfocusingontheneuralnetworkanddeeplearningmethods(awidereviewcan


• Lexicalisation is the task to decide what specific words should be selectedto express the determined content. For example, the actual nouns, verbs,adjectives and adverbs to appear in the text are chosen from a lexicon.Lexicalisation determines the particular linguistic terms to be used to ex-plain the domain concepts and their relations. In an NLG system, the roleof lexicalisation is essential since the mapping between extracted numer-ical information to the predefined lexicon is not a trivial task.

• Referring expression generation is the task to decide what expressionsshould be used to refer to domain concepts and entities, in a way thatthe reader of the system recognises what the message refers to. So, refer-ring expression generation is about selecting proper names and reason-able pronouns, or providing a sufficient set of descriptions for an entityor object in order to distinguish that entity from the rest.

• Linguistic realisation is the task to apply grammatical rules of the targetlanguage considering both morphology (the study of word forms) andsyntax (the study of sentence structure). This task converts the providedabstract symbolic representations of the messages into the actual finalsentences in natural language.

An important class of NLG frameworks is data-to-text systems, wherein alinguistic summarisation of numeric data is produced with the help of data min-ing and AI algorithms. The main architecture of data-to-text systems has beenintroduced by Reiter [182] that includes the following stages: signal analysis,data interpretation, document planning, microplanning and realisation (Fig-ure 2.2). These systems identify and abstract the patterns of the numeric data,determine the most useful and relevant information, and generate a naturallanguage text for the acquired knowledge in an understandable form [119].

A complete survey on reviewing NLG tasks, architectures, and applica-tions has recently been provided by Gatt and Krahmer [93]. According to[93], the developed NLG systems can be divided into three approaches: rule-based (modular), planning-based, and data-driven approaches. Rule-based ap-proaches consider a crisp division among the NLG tasks. These methods usu-ally follow a set of pre-defined rules in order to perform each of sub-taskswithin the NLG architecture. Planning approaches aim to look at the goal ofgenerating text from data as the process of determining a sequence of actions.These approaches combine slightly the sub-tasks of NLG to generate text. Data-driven approaches are the new dominant trend in NLG, which consider thegoal of text generation as an integrated task. Data-driven approaches performthis goal usually by applying statistical learning on the alignment of the inputnon-linguistic information and the output texts. This automatic learning of thecorrespondences between data and text make these approaches data-driven.Several pieces of research have been done from this perspective, especially byfocusing on the neural network and deep learning methods (a wide review can


•Lexicalisationisthetasktodecidewhatspecificwordsshouldbeselectedtoexpressthedeterminedcontent.Forexample,theactualnouns,verbs,adjectivesandadverbstoappearinthetextarechosenfromalexicon.Lexicalisationdeterminestheparticularlinguistictermstobeusedtoex-plainthedomainconceptsandtheirrelations.InanNLGsystem,theroleoflexicalisationisessentialsincethemappingbetweenextractednumer-icalinformationtothepredefinedlexiconisnotatrivialtask.

•Referringexpressiongenerationisthetasktodecidewhatexpressionsshouldbeusedtorefertodomainconceptsandentities,inawaythatthereaderofthesystemrecogniseswhatthemessagerefersto.So,refer-ringexpressiongenerationisaboutselectingpropernamesandreason-ablepronouns,orprovidingasufficientsetofdescriptionsforanentityorobjectinordertodistinguishthatentityfromtherest.

•Linguisticrealisationisthetasktoapplygrammaticalrulesofthetargetlanguageconsideringbothmorphology(thestudyofwordforms)andsyntax(thestudyofsentencestructure).Thistaskconvertstheprovidedabstractsymbolicrepresentationsofthemessagesintotheactualfinalsentencesinnaturallanguage.

AnimportantclassofNLGframeworksisdata-to-textsystems,whereinalinguisticsummarisationofnumericdataisproducedwiththehelpofdatamin-ingandAIalgorithms.Themainarchitectureofdata-to-textsystemshasbeenintroducedbyReiter[182]thatincludesthefollowingstages:signalanalysis,datainterpretation,documentplanning,microplanningandrealisation(Fig-ure2.2).Thesesystemsidentifyandabstractthepatternsofthenumericdata,determinethemostusefulandrelevantinformation,andgenerateanaturallanguagetextfortheacquiredknowledgeinanunderstandableform[119].

AcompletesurveyonreviewingNLGtasks,architectures,andapplica-tionshasrecentlybeenprovidedbyGattandKrahmer[93].Accordingto[93],thedevelopedNLGsystemscanbedividedintothreeapproaches:rule-based(modular),planning-based,anddata-drivenapproaches.Rule-basedap-proachesconsideracrispdivisionamongtheNLGtasks.Thesemethodsusu-allyfollowasetofpre-definedrulesinordertoperformeachofsub-taskswithintheNLGarchitecture.Planningapproachesaimtolookatthegoalofgeneratingtextfromdataastheprocessofdeterminingasequenceofactions.Theseapproachescombineslightlythesub-tasksofNLGtogeneratetext.Data-drivenapproachesarethenewdominanttrendinNLG,whichconsiderthegoaloftextgenerationasanintegratedtask.Data-drivenapproachesperformthisgoalusuallybyapplyingstatisticallearningonthealignmentoftheinputnon-linguisticinformationandtheoutputtexts.Thisautomaticlearningofthecorrespondencesbetweendataandtextmaketheseapproachesdata-driven.Severalpiecesofresearchhavebeendonefromthisperspective,especiallybyfocusingontheneuralnetworkanddeeplearningmethods(awidereviewcan


Figure 2.2: The architecture of data-to-text systems, proposed by Reiter [182].

be found in [93]). However, the remaining question is how much the learningprocess is dependent on the goodness of provided set of training information(aligned data and text).

Recently, there is a growing demand for NLG and data-to-text systems inreal-world applications. Examples of well established NLG applications includethe generation of weather forecasting reports from meteorological data [104,186, 215], communicating financial and statistical information [79, 120]. BT-Nurse [119] and BabyTalk [94, 174] are the recent examples of data-to-textsystems in order to generate documents in medical domains which producesummaries of data sets about the state of neonatal babies from intensive caredata. Most of these applications use small amounts of homogeneous data andare supported by a significant amount of predefined knowledge.

Knowledge Acquisition for Content Determination

Knowledge acquisition (KA) is an essential part of building natural languagegeneration systems. Two types of KA techniques including corpus-based KAand structured expert-oriented KA have been previously studied for NLG sys-tems in [187]. These techniques aim to enrich the similarities between generatedtexts and natural human-written texts. Both of the techniques use rule-basedmethods to improve the quality of the acquired knowledge for such systems.

Export-oriented techniques use experts of the domain to acquire knowledgein a structured manner (e.g., direct interviews, questionnaires). As an example


Figure2.2:Thearchitectureofdata-to-textsystems,proposedbyReiter[182].

befoundin[93]).However,theremainingquestionishowmuchthelearningprocessisdependentonthegoodnessofprovidedsetoftraininginformation(aligneddataandtext).

Recently,thereisagrowingdemandforNLGanddata-to-textsystemsinreal-worldapplications.ExamplesofwellestablishedNLGapplicationsincludethegenerationofweatherforecastingreportsfrommeteorologicaldata[104,186,215],communicatingfinancialandstatisticalinformation[79,120].BT-Nurse[119]andBabyTalk[94,174]aretherecentexamplesofdata-to-textsystemsinordertogeneratedocumentsinmedicaldomainswhichproducesummariesofdatasetsaboutthestateofneonatalbabiesfromintensivecaredata.Mostoftheseapplicationsusesmallamountsofhomogeneousdataandaresupportedbyasignificantamountofpredefinedknowledge.

KnowledgeAcquisitionforContentDetermination

Knowledgeacquisition(KA)isanessentialpartofbuildingnaturallanguagegenerationsystems.TwotypesofKAtechniquesincludingcorpus-basedKAandstructuredexpert-orientedKAhavebeenpreviouslystudiedforNLGsys-temsin[187].Thesetechniquesaimtoenrichthesimilaritiesbetweengeneratedtextsandnaturalhuman-writtentexts.Bothofthetechniquesuserule-basedmethodstoimprovethequalityoftheacquiredknowledgeforsuchsystems.

Export-orientedtechniquesuseexpertsofthedomaintoacquireknowledgeinastructuredmanner(e.g.,directinterviews,questionnaires).Asanexample


Figure 2.2: The architecture of data-to-text systems, proposed by Reiter [182].

be found in [93]). However, the remaining question is how much the learningprocess is dependent on the goodness of provided set of training information(aligned data and text).

Recently, there is a growing demand for NLG and data-to-text systems inreal-world applications. Examples of well established NLG applications includethe generation of weather forecasting reports from meteorological data [104,186, 215], communicating financial and statistical information [79, 120]. BT-Nurse [119] and BabyTalk [94, 174] are the recent examples of data-to-textsystems in order to generate documents in medical domains which producesummaries of data sets about the state of neonatal babies from intensive caredata. Most of these applications use small amounts of homogeneous data andare supported by a significant amount of predefined knowledge.

Knowledge Acquisition for Content Determination

Knowledge acquisition (KA) is an essential part of building natural languagegeneration systems. Two types of KA techniques including corpus-based KAand structured expert-oriented KA have been previously studied for NLG sys-tems in [187]. These techniques aim to enrich the similarities between generatedtexts and natural human-written texts. Both of the techniques use rule-basedmethods to improve the quality of the acquired knowledge for such systems.

Export-oriented techniques use experts of the domain to acquire knowledgein a structured manner (e.g., direct interviews, questionnaires). As an example


Figure2.2:Thearchitectureofdata-to-textsystems,proposedbyReiter[182].

befoundin[93]).However,theremainingquestionishowmuchthelearningprocessisdependentonthegoodnessofprovidedsetoftraininginformation(aligneddataandtext).

Recently,thereisagrowingdemandforNLGanddata-to-textsystemsinreal-worldapplications.ExamplesofwellestablishedNLGapplicationsincludethegenerationofweatherforecastingreportsfrommeteorologicaldata[104,186,215],communicatingfinancialandstatisticalinformation[79,120].BT-Nurse[119]andBabyTalk[94,174]aretherecentexamplesofdata-to-textsystemsinordertogeneratedocumentsinmedicaldomainswhichproducesummariesofdatasetsaboutthestateofneonatalbabiesfromintensivecaredata.Mostoftheseapplicationsusesmallamountsofhomogeneousdataandaresupportedbyasignificantamountofpredefinedknowledge.

KnowledgeAcquisitionforContentDetermination

Knowledgeacquisition(KA)isanessentialpartofbuildingnaturallanguagegenerationsystems.TwotypesofKAtechniquesincludingcorpus-basedKAandstructuredexpert-orientedKAhavebeenpreviouslystudiedforNLGsys-temsin[187].Thesetechniquesaimtoenrichthesimilaritiesbetweengeneratedtextsandnaturalhuman-writtentexts.Bothofthetechniquesuserule-basedmethodstoimprovethequalityoftheacquiredknowledgeforsuchsystems.

Export-orientedtechniquesuseexpertsofthedomaintoacquireknowledgeinastructuredmanner(e.g.,directinterviews,questionnaires).Asanexample


in the medical area, one can ask doctors or nurses about the necessary informa-tion that are needed to be captured and presented in the final generated text.Corpus-based techniques rely on the analysis of available corpus, and learnfrom provided sources of information such as data sets themselves. For ex-ample, one can use previous corpora of human-written reports (e.g., clinicalreports) to acquire the necessary elements and information, to reproduce thesame knowledge in the final generated text.

In data-to-text systems, the acquired knowledge, also called domain knowl-edge, is usually organised in the form of taxonomies or ontologies of infor-mation. All the stages of a data-to-text architecture (shown in Figure 2.2)then use these provided taxonomies of knowledge to perform their tasks. Inparticular, signal analysis stage extracts information that is determined in tax-onomies such as simple patterns, events, and trends. Also, the data interpreta-tion stage abstracts information into symbolic messages using the representedtaxonomies. The most recent data-to-text frameworks developed by Reiter’sarchitecture [182] have acquired the taxonomies or ontologies correspondingto the domain knowledge. For instance, the work on summarising gas turbinetime series [237] has used expert knowledge to provide a taxonomy of theprimitive patterns (i.e., spikes, steps, oscillations). Similarly, systems related tothe BabyTalk project [94, 174] have stored medically known observation (e.g.,bradycardia) in local ontologies. The core of such systems is based on the rich-ness of the domain knowledge in the provided taxonomies which are usuallybounded by expert rules. This organised domain knowledge is usually an in-flexible input to the framework which restricts the output of the stages in data-to-text architectures. For instance, the taxonomy in [237] does not allow thesystem to represent unexpected observations (e.g., wave or burst) out of thepredefined domain knowledge. Likewise, in the medical domain, an unknownphysiological pattern will be ignored if it does not have a corresponding entityin the provided ontology by the domain experts.

Determining suitable content is mostly addressed using knowledge-drivenand rule-based approaches to comply with the domain or user requirements [102,237]. However, there is a new trend of data-driven approaches to performingsuch task without relying on a set of pre-defined knowledge to map data to lex-icons. Instead, these approaches aim to use learning techniques, such as hiddenMarkov models [33] and optimisation methods [133], in order to automaticallyground language acquisition and align numerical observation to their properdescriptions [93]. These approaches are independent of rules, however, theyare still dependent on a set of well-defined correspondences between input dataand the output text in a supervised manner. The way that this thesis aims to dothe data-driven content determination is to rely on the observed data and itsbehaviours that can be beyond the user requirements, though still meaningfulto represent [31].

Although some studies in NLG, like SumTime [186], claim that building acomplete data-to-text system is good enough to be used for producing text as


inthemedicalarea,onecanaskdoctorsornursesaboutthenecessaryinforma-tionthatareneededtobecapturedandpresentedinthefinalgeneratedtext.Corpus-basedtechniquesrelyontheanalysisofavailablecorpus,andlearnfromprovidedsourcesofinformationsuchasdatasetsthemselves.Forex-ample,onecanusepreviouscorporaofhuman-writtenreports(e.g.,clinicalreports)toacquirethenecessaryelementsandinformation,toreproducethesameknowledgeinthefinalgeneratedtext.

Indata-to-textsystems,theacquiredknowledge,alsocalleddomainknowl-edge,isusuallyorganisedintheformoftaxonomiesorontologiesofinfor-mation.Allthestagesofadata-to-textarchitecture(showninFigure2.2)thenusetheseprovidedtaxonomiesofknowledgetoperformtheirtasks.Inparticular,signalanalysisstageextractsinformationthatisdeterminedintax-onomiessuchassimplepatterns,events,andtrends.Also,thedatainterpreta-tionstageabstractsinformationintosymbolicmessagesusingtherepresentedtaxonomies.Themostrecentdata-to-textframeworksdevelopedbyReiter’sarchitecture[182]haveacquiredthetaxonomiesorontologiescorrespondingtothedomainknowledge.Forinstance,theworkonsummarisinggasturbinetimeseries[237]hasusedexpertknowledgetoprovideataxonomyoftheprimitivepatterns(i.e.,spikes,steps,oscillations).Similarly,systemsrelatedtotheBabyTalkproject[94,174]havestoredmedicallyknownobservation(e.g.,bradycardia)inlocalontologies.Thecoreofsuchsystemsisbasedontherich-nessofthedomainknowledgeintheprovidedtaxonomieswhichareusuallyboundedbyexpertrules.Thisorganiseddomainknowledgeisusuallyanin-flexibleinputtotheframeworkwhichrestrictstheoutputofthestagesindata-to-textarchitectures.Forinstance,thetaxonomyin[237]doesnotallowthesystemtorepresentunexpectedobservations(e.g.,waveorburst)outofthepredefineddomainknowledge.Likewise,inthemedicaldomain,anunknownphysiologicalpatternwillbeignoredifitdoesnothaveacorrespondingentityintheprovidedontologybythedomainexperts.

Determiningsuitablecontentismostlyaddressedusingknowledge-drivenandrule-basedapproachestocomplywiththedomainoruserrequirements[102,237].However,thereisanewtrendofdata-drivenapproachestoperformingsuchtaskwithoutrelyingonasetofpre-definedknowledgetomapdatatolex-icons.Instead,theseapproachesaimtouselearningtechniques,suchashiddenMarkovmodels[33]andoptimisationmethods[133],inordertoautomaticallygroundlanguageacquisitionandalignnumericalobservationtotheirproperdescriptions[93].Theseapproachesareindependentofrules,however,theyarestilldependentonasetofwell-definedcorrespondencesbetweeninputdataandtheoutputtextinasupervisedmanner.Thewaythatthisthesisaimstodothedata-drivencontentdeterminationistorelyontheobserveddataanditsbehavioursthatcanbebeyondtheuserrequirements,thoughstillmeaningfultorepresent[31].

AlthoughsomestudiesinNLG,likeSumTime[186],claimthatbuildingacompletedata-to-textsystemisgoodenoughtobeusedforproducingtextas


in the medical area, one can ask doctors or nurses about the necessary informa-tion that are needed to be captured and presented in the final generated text.Corpus-based techniques rely on the analysis of available corpus, and learnfrom provided sources of information such as data sets themselves. For ex-ample, one can use previous corpora of human-written reports (e.g., clinicalreports) to acquire the necessary elements and information, to reproduce thesame knowledge in the final generated text.

In data-to-text systems, the acquired knowledge, also called domain knowl-edge, is usually organised in the form of taxonomies or ontologies of infor-mation. All the stages of a data-to-text architecture (shown in Figure 2.2)then use these provided taxonomies of knowledge to perform their tasks. Inparticular, signal analysis stage extracts information that is determined in tax-onomies such as simple patterns, events, and trends. Also, the data interpreta-tion stage abstracts information into symbolic messages using the representedtaxonomies. The most recent data-to-text frameworks developed by Reiter’sarchitecture [182] have acquired the taxonomies or ontologies correspondingto the domain knowledge. For instance, the work on summarising gas turbinetime series [237] has used expert knowledge to provide a taxonomy of theprimitive patterns (i.e., spikes, steps, oscillations). Similarly, systems related tothe BabyTalk project [94, 174] have stored medically known observation (e.g.,bradycardia) in local ontologies. The core of such systems is based on the rich-ness of the domain knowledge in the provided taxonomies which are usuallybounded by expert rules. This organised domain knowledge is usually an in-flexible input to the framework which restricts the output of the stages in data-to-text architectures. For instance, the taxonomy in [237] does not allow thesystem to represent unexpected observations (e.g., wave or burst) out of thepredefined domain knowledge. Likewise, in the medical domain, an unknownphysiological pattern will be ignored if it does not have a corresponding entityin the provided ontology by the domain experts.

Determining suitable content is mostly addressed using knowledge-drivenand rule-based approaches to comply with the domain or user requirements [102,237]. However, there is a new trend of data-driven approaches to performingsuch task without relying on a set of pre-defined knowledge to map data to lex-icons. Instead, these approaches aim to use learning techniques, such as hiddenMarkov models [33] and optimisation methods [133], in order to automaticallyground language acquisition and align numerical observation to their properdescriptions [93]. These approaches are independent of rules, however, theyare still dependent on a set of well-defined correspondences between input dataand the output text in a supervised manner. The way that this thesis aims to dothe data-driven content determination is to rely on the observed data and itsbehaviours that can be beyond the user requirements, though still meaningfulto represent [31].

Although some studies in NLG, like SumTime [186], claim that building acomplete data-to-text system is good enough to be used for producing text as


inthemedicalarea,onecanaskdoctorsornursesaboutthenecessaryinforma-tionthatareneededtobecapturedandpresentedinthefinalgeneratedtext.Corpus-basedtechniquesrelyontheanalysisofavailablecorpus,andlearnfromprovidedsourcesofinformationsuchasdatasetsthemselves.Forex-ample,onecanusepreviouscorporaofhuman-writtenreports(e.g.,clinicalreports)toacquirethenecessaryelementsandinformation,toreproducethesameknowledgeinthefinalgeneratedtext.

Indata-to-textsystems,theacquiredknowledge,alsocalleddomainknowl-edge,isusuallyorganisedintheformoftaxonomiesorontologiesofinfor-mation.Allthestagesofadata-to-textarchitecture(showninFigure2.2)thenusetheseprovidedtaxonomiesofknowledgetoperformtheirtasks.Inparticular,signalanalysisstageextractsinformationthatisdeterminedintax-onomiessuchassimplepatterns,events,andtrends.Also,thedatainterpreta-tionstageabstractsinformationintosymbolicmessagesusingtherepresentedtaxonomies.Themostrecentdata-to-textframeworksdevelopedbyReiter’sarchitecture[182]haveacquiredthetaxonomiesorontologiescorrespondingtothedomainknowledge.Forinstance,theworkonsummarisinggasturbinetimeseries[237]hasusedexpertknowledgetoprovideataxonomyoftheprimitivepatterns(i.e.,spikes,steps,oscillations).Similarly,systemsrelatedtotheBabyTalkproject[94,174]havestoredmedicallyknownobservation(e.g.,bradycardia)inlocalontologies.Thecoreofsuchsystemsisbasedontherich-nessofthedomainknowledgeintheprovidedtaxonomieswhichareusuallyboundedbyexpertrules.Thisorganiseddomainknowledgeisusuallyanin-flexibleinputtotheframeworkwhichrestrictstheoutputofthestagesindata-to-textarchitectures.Forinstance,thetaxonomyin[237]doesnotallowthesystemtorepresentunexpectedobservations(e.g.,waveorburst)outofthepredefineddomainknowledge.Likewise,inthemedicaldomain,anunknownphysiologicalpatternwillbeignoredifitdoesnothaveacorrespondingentityintheprovidedontologybythedomainexperts.

Determiningsuitablecontentismostlyaddressedusingknowledge-drivenandrule-basedapproachestocomplywiththedomainoruserrequirements[102,237].However,thereisanewtrendofdata-drivenapproachestoperformingsuchtaskwithoutrelyingonasetofpre-definedknowledgetomapdatatolex-icons.Instead,theseapproachesaimtouselearningtechniques,suchashiddenMarkovmodels[33]andoptimisationmethods[133],inordertoautomaticallygroundlanguageacquisitionandalignnumericalobservationtotheirproperdescriptions[93].Theseapproachesareindependentofrules,however,theyarestilldependentonasetofwell-definedcorrespondencesbetweeninputdataandtheoutputtextinasupervisedmanner.Thewaythatthisthesisaimstodothedata-drivencontentdeterminationistorelyontheobserveddataanditsbehavioursthatcanbebeyondtheuserrequirements,thoughstillmeaningfultorepresent[31].

AlthoughsomestudiesinNLG,likeSumTime[186],claimthatbuildingacompletedata-to-textsystemisgoodenoughtobeusedforproducingtextas

2.4. CONCLUSIONS 33

good as a human does, there is still the lack of modelling cognitive task in NLGsystems to study e.g., how humans model observations in their mind, and thenexplain them by linguistic terms. This issue in rule-based data-to-text systemsreveals the necessity of reorganising the way of modelling domain knowledgein order to also cover explaining unseen information across the data.

2.4 Conclusions

A semantic representation of numerical data is the key point to bridge betweenconceptual spaces theory and linguistic description approaches by modellingdescriptive features of data. On one hand, concept formation based on per-ceived information in a conceptual space relies on the descriptive features inorder to determine the domains and quality dimensions of the spaces. Thisdata-driven manner of modelling features leads to a semantic representation ofnon-linguistic information in order to infer meaningful descriptions. Importingdescriptive features to computational systems includes the possibility of oper-ating with linguistic information [35]. On the other hand, deriving linguisticdescriptions for a set of numerical perceived data is dependent on the semanticlevel of its descriptive features. This set of information can be obtained fromvarious sources such as observations, sensor measurements, mathematical anal-ysis or visual perceptions [35]. Since conceptual spaces have been developedto model the descriptive features of concepts for further reasoning, it can beemployed as a robust framework to perform content determination task in lin-guistic descriptions using semantic inferences.

Regarding the relation of these two fields in the literature, the problem ofmodelling natural language using conceptual spaces has been investigated ina few isolated works [10, 72]. Evidently, in most of the proposed conceptualrepresentations, the primary interest is not the relation of concepts and naturallanguage [108]. Aisbett et al., [15] recently investigated the integration of con-ceptual spaces theory with the topic of computing with words by introducing afuzzy representation of conceptual spaces’ elements. Domains and dimensionsin their work, however, are crisp elements with no role concerning the qualifi-cation of objects within the space. Also, [72] attempted to derive the semanticrelations within conceptual spaces built upon text documents. However, to thebest of our knowledge, there is no study on the conceptual spaces to derivenatural language descriptions for the numeric inputs through a conceptual rep-resentation.

As the final point to conclude this chapter, it worth mentioning that theproposed approach is called semantic representation (and not only conceptualrepresentation), because it goes beyond representing the concepts as structuresin mind. The proposed approach links the perceived information to naturallanguage by “linking concepts to meaning” [111] using semantic inferences.

2.4.CONCLUSIONS33

goodasahumandoes,thereisstillthelackofmodellingcognitivetaskinNLGsystemstostudye.g.,howhumansmodelobservationsintheirmind,andthenexplainthembylinguisticterms.Thisissueinrule-baseddata-to-textsystemsrevealsthenecessityofreorganisingthewayofmodellingdomainknowledgeinordertoalsocoverexplainingunseeninformationacrossthedata.

2.4Conclusions

Asemanticrepresentationofnumericaldataisthekeypointtobridgebetweenconceptualspacestheoryandlinguisticdescriptionapproachesbymodellingdescriptivefeaturesofdata.Ononehand,conceptformationbasedonper-ceivedinformationinaconceptualspacereliesonthedescriptivefeaturesinordertodeterminethedomainsandqualitydimensionsofthespaces.Thisdata-drivenmannerofmodellingfeaturesleadstoasemanticrepresentationofnon-linguisticinformationinordertoinfermeaningfuldescriptions.Importingdescriptivefeaturestocomputationalsystemsincludesthepossibilityofoper-atingwithlinguisticinformation[35].Ontheotherhand,derivinglinguisticdescriptionsforasetofnumericalperceiveddataisdependentonthesemanticlevelofitsdescriptivefeatures.Thissetofinformationcanbeobtainedfromvarioussourcessuchasobservations,sensormeasurements,mathematicalanal-ysisorvisualperceptions[35].Sinceconceptualspaceshavebeendevelopedtomodelthedescriptivefeaturesofconceptsforfurtherreasoning,itcanbeemployedasarobustframeworktoperformcontentdeterminationtaskinlin-guisticdescriptionsusingsemanticinferences.

Regardingtherelationofthesetwofieldsintheliterature,theproblemofmodellingnaturallanguageusingconceptualspaceshasbeeninvestigatedinafewisolatedworks[10,72].Evidently,inmostoftheproposedconceptualrepresentations,theprimaryinterestisnottherelationofconceptsandnaturallanguage[108].Aisbettetal.,[15]recentlyinvestigatedtheintegrationofcon-ceptualspacestheorywiththetopicofcomputingwithwordsbyintroducingafuzzyrepresentationofconceptualspaces’elements.Domainsanddimensionsintheirwork,however,arecrispelementswithnoroleconcerningthequalifi-cationofobjectswithinthespace.Also,[72]attemptedtoderivethesemanticrelationswithinconceptualspacesbuiltupontextdocuments.However,tothebestofourknowledge,thereisnostudyontheconceptualspacestoderivenaturallanguagedescriptionsforthenumericinputsthroughaconceptualrep-resentation.

Asthefinalpointtoconcludethischapter,itworthmentioningthattheproposedapproachiscalledsemanticrepresentation(andnotonlyconceptualrepresentation),becauseitgoesbeyondrepresentingtheconceptsasstructuresinmind.Theproposedapproachlinkstheperceivedinformationtonaturallanguageby“linkingconceptstomeaning”[111]usingsemanticinferences.

2.4. CONCLUSIONS 33

good as a human does, there is still the lack of modelling cognitive task in NLGsystems to study e.g., how humans model observations in their mind, and thenexplain them by linguistic terms. This issue in rule-based data-to-text systemsreveals the necessity of reorganising the way of modelling domain knowledgein order to also cover explaining unseen information across the data.

2.4 Conclusions

A semantic representation of numerical data is the key point to bridge betweenconceptual spaces theory and linguistic description approaches by modellingdescriptive features of data. On one hand, concept formation based on per-ceived information in a conceptual space relies on the descriptive features inorder to determine the domains and quality dimensions of the spaces. Thisdata-driven manner of modelling features leads to a semantic representation ofnon-linguistic information in order to infer meaningful descriptions. Importingdescriptive features to computational systems includes the possibility of oper-ating with linguistic information [35]. On the other hand, deriving linguisticdescriptions for a set of numerical perceived data is dependent on the semanticlevel of its descriptive features. This set of information can be obtained fromvarious sources such as observations, sensor measurements, mathematical anal-ysis or visual perceptions [35]. Since conceptual spaces have been developedto model the descriptive features of concepts for further reasoning, it can beemployed as a robust framework to perform content determination task in lin-guistic descriptions using semantic inferences.

Regarding the relation of these two fields in the literature, the problem ofmodelling natural language using conceptual spaces has been investigated ina few isolated works [10, 72]. Evidently, in most of the proposed conceptualrepresentations, the primary interest is not the relation of concepts and naturallanguage [108]. Aisbett et al., [15] recently investigated the integration of con-ceptual spaces theory with the topic of computing with words by introducing afuzzy representation of conceptual spaces’ elements. Domains and dimensionsin their work, however, are crisp elements with no role concerning the qualifi-cation of objects within the space. Also, [72] attempted to derive the semanticrelations within conceptual spaces built upon text documents. However, to thebest of our knowledge, there is no study on the conceptual spaces to derivenatural language descriptions for the numeric inputs through a conceptual rep-resentation.

As the final point to conclude this chapter, it worth mentioning that theproposed approach is called semantic representation (and not only conceptualrepresentation), because it goes beyond representing the concepts as structuresin mind. The proposed approach links the perceived information to naturallanguage by “linking concepts to meaning” [111] using semantic inferences.

2.4.CONCLUSIONS33

goodasahumandoes,thereisstillthelackofmodellingcognitivetaskinNLGsystemstostudye.g.,howhumansmodelobservationsintheirmind,andthenexplainthembylinguisticterms.Thisissueinrule-baseddata-to-textsystemsrevealsthenecessityofreorganisingthewayofmodellingdomainknowledgeinordertoalsocoverexplainingunseeninformationacrossthedata.

2.4Conclusions

Asemanticrepresentationofnumericaldataisthekeypointtobridgebetweenconceptualspacestheoryandlinguisticdescriptionapproachesbymodellingdescriptivefeaturesofdata.Ononehand,conceptformationbasedonper-ceivedinformationinaconceptualspacereliesonthedescriptivefeaturesinordertodeterminethedomainsandqualitydimensionsofthespaces.Thisdata-drivenmannerofmodellingfeaturesleadstoasemanticrepresentationofnon-linguisticinformationinordertoinfermeaningfuldescriptions.Importingdescriptivefeaturestocomputationalsystemsincludesthepossibilityofoper-atingwithlinguisticinformation[35].Ontheotherhand,derivinglinguisticdescriptionsforasetofnumericalperceiveddataisdependentonthesemanticlevelofitsdescriptivefeatures.Thissetofinformationcanbeobtainedfromvarioussourcessuchasobservations,sensormeasurements,mathematicalanal-ysisorvisualperceptions[35].Sinceconceptualspaceshavebeendevelopedtomodelthedescriptivefeaturesofconceptsforfurtherreasoning,itcanbeemployedasarobustframeworktoperformcontentdeterminationtaskinlin-guisticdescriptionsusingsemanticinferences.

Regardingtherelationofthesetwofieldsintheliterature,theproblemofmodellingnaturallanguageusingconceptualspaceshasbeeninvestigatedinafewisolatedworks[10,72].Evidently,inmostoftheproposedconceptualrepresentations,theprimaryinterestisnottherelationofconceptsandnaturallanguage[108].Aisbettetal.,[15]recentlyinvestigatedtheintegrationofcon-ceptualspacestheorywiththetopicofcomputingwithwordsbyintroducingafuzzyrepresentationofconceptualspaces’elements.Domainsanddimensionsintheirwork,however,arecrispelementswithnoroleconcerningthequalifi-cationofobjectswithinthespace.Also,[72]attemptedtoderivethesemanticrelationswithinconceptualspacesbuiltupontextdocuments.However,tothebestofourknowledge,thereisnostudyontheconceptualspacestoderivenaturallanguagedescriptionsforthenumericinputsthroughaconceptualrep-resentation.

Asthefinalpointtoconcludethischapter,itworthmentioningthattheproposedapproachiscalledsemanticrepresentation(andnotonlyconceptualrepresentation),becauseitgoesbeyondrepresentingtheconceptsasstructuresinmind.Theproposedapproachlinkstheperceivedinformationtonaturallanguageby“linkingconceptstomeaning”[111]usingsemanticinferences.

Chapter 3

Data-Driven Construction of

Conceptual Spaces

“Solving a problem simply means representing it so as tomake the solution transparent.”

— Herb Simon (1916–2001)

This chapter presents one of the leading contributions of this thesis, that ishow to automatically construct a conceptual space in a data-driven man-

ner from a numeric data set. The approach to constructing conceptual spaces isconsidered to be data-driven as it is automatically constructed by processing thedata matrix of the observations based on the variable values and class labels.This is in contrast with knowledge-driven approaches that have to be manu-ally constructed using psychologically or scientifically pre-defined knowledgeabout the relations between quality dimensions, domains and the concepts’ re-gions [10, 87]. Practically, the process of constructing a conceptual space isabout determining its essential elements. According to [14, 181], the definitionof a conceptual space is as follows:

Definition 3.1. A conceptual space S is defined as a 4-tuple 〈Q,Δ,C, Γ〉, whereQ is a set of quality dimensions, Δ is a set of domains, C is a set of concepts inthe space S, and Γ is a set of instances representing the concepts.

The representations of the elements are rigorously explained in further def-initions (from 3.2 to 3.5). To automate the process of constructing conceptualspaces, the definitions of the conceptual spaces’ elements are modified slightlycompared to previous formulations (e.g., [9,189]). These modifications are nec-essary to use the constructed conceptual space as a model of a semantic repre-sentation for the inference of unknown observations.

35

Chapter3

Data-DrivenConstructionof

ConceptualSpaces

“Solvingaproblemsimplymeansrepresentingitsoastomakethesolutiontransparent.”

—HerbSimon(1916–2001)

Thischapterpresentsoneoftheleadingcontributionsofthisthesis,thatishowtoautomaticallyconstructaconceptualspaceinadata-drivenman-

nerfromanumericdataset.Theapproachtoconstructingconceptualspacesisconsideredtobedata-drivenasitisautomaticallyconstructedbyprocessingthedatamatrixoftheobservationsbasedonthevariablevaluesandclasslabels.Thisisincontrastwithknowledge-drivenapproachesthathavetobemanu-allyconstructedusingpsychologicallyorscientificallypre-definedknowledgeabouttherelationsbetweenqualitydimensions,domainsandtheconcepts’re-gions[10,87].Practically,theprocessofconstructingaconceptualspaceisaboutdeterminingitsessentialelements.Accordingto[14,181],thedefinitionofaconceptualspaceisasfollows:

Definition3.1.AconceptualspaceSisdefinedasa4-tuple〈Q,Δ,C,Γ〉,whereQisasetofqualitydimensions,Δisasetofdomains,CisasetofconceptsinthespaceS,andΓisasetofinstancesrepresentingtheconcepts.

Therepresentationsoftheelementsarerigorouslyexplainedinfurtherdef-initions(from3.2to3.5).Toautomatetheprocessofconstructingconceptualspaces,thedefinitionsoftheconceptualspaces’elementsaremodifiedslightlycomparedtopreviousformulations(e.g.,[9,189]).Thesemodificationsarenec-essarytousetheconstructedconceptualspaceasamodelofasemanticrepre-sentationfortheinferenceofunknownobservations.

35

Chapter 3

Data-Driven Construction of

Conceptual Spaces

“Solving a problem simply means representing it so as tomake the solution transparent.”

— Herb Simon (1916–2001)

This chapter presents one of the leading contributions of this thesis, that ishow to automatically construct a conceptual space in a data-driven man-

ner from a numeric data set. The approach to constructing conceptual spaces isconsidered to be data-driven as it is automatically constructed by processing thedata matrix of the observations based on the variable values and class labels.This is in contrast with knowledge-driven approaches that have to be manu-ally constructed using psychologically or scientifically pre-defined knowledgeabout the relations between quality dimensions, domains and the concepts’ re-gions [10, 87]. Practically, the process of constructing a conceptual space isabout determining its essential elements. According to [14, 181], the definitionof a conceptual space is as follows:

Definition 3.1. A conceptual space S is defined as a 4-tuple 〈Q,Δ,C, Γ〉, whereQ is a set of quality dimensions, Δ is a set of domains, C is a set of concepts inthe space S, and Γ is a set of instances representing the concepts.

The representations of the elements are rigorously explained in further def-initions (from 3.2 to 3.5). To automate the process of constructing conceptualspaces, the definitions of the conceptual spaces’ elements are modified slightlycompared to previous formulations (e.g., [9,189]). These modifications are nec-essary to use the constructed conceptual space as a model of a semantic repre-sentation for the inference of unknown observations.

35

Chapter3

Data-DrivenConstructionof

ConceptualSpaces

“Solvingaproblemsimplymeansrepresentingitsoastomakethesolutiontransparent.”

—HerbSimon(1916–2001)

Thischapterpresentsoneoftheleadingcontributionsofthisthesis,thatishowtoautomaticallyconstructaconceptualspaceinadata-drivenman-

nerfromanumericdataset.Theapproachtoconstructingconceptualspacesisconsideredtobedata-drivenasitisautomaticallyconstructedbyprocessingthedatamatrixoftheobservationsbasedonthevariablevaluesandclasslabels.Thisisincontrastwithknowledge-drivenapproachesthathavetobemanu-allyconstructedusingpsychologicallyorscientificallypre-definedknowledgeabouttherelationsbetweenqualitydimensions,domainsandtheconcepts’re-gions[10,87].Practically,theprocessofconstructingaconceptualspaceisaboutdeterminingitsessentialelements.Accordingto[14,181],thedefinitionofaconceptualspaceisasfollows:

Definition3.1.AconceptualspaceSisdefinedasa4-tuple〈Q,Δ,C,Γ〉,whereQisasetofqualitydimensions,Δisasetofdomains,CisasetofconceptsinthespaceS,andΓisasetofinstancesrepresentingtheconcepts.

Therepresentationsoftheelementsarerigorouslyexplainedinfurtherdef-initions(from3.2to3.5).Toautomatetheprocessofconstructingconceptualspaces,thedefinitionsoftheconceptualspaces’elementsaremodifiedslightlycomparedtopreviousformulations(e.g.,[9,189]).Thesemodificationsarenec-essarytousetheconstructedconceptualspaceasamodelofasemanticrepre-sentationfortheinferenceofunknownobservations.

35

36 CHAPTER 3. CONSTRUCTION OF CONCEPTUAL SPACES

To begin, it is assumed that a given data set M contains a set of possible classlabels, a set of predefined features, and the input observations with known classlabels, which are characterised by feature values. Given a set of class labelsY = {y1, . . . ,ym} and a set of features F = {X1, . . . ,Xn}, let D be the set ofknown observations, denoted by D = {oi : (xoi

,yoi)}, where oi consists of a

n-dimensional feature vector xoi= [x1, . . . , xn], and an output label yoi

∈ Y.The component xj (j = 1, . . .n) in the vector xoi

is the measured value of thecorresponding feature Xj ∈ F.

Here, each feature X is defined as a couple of values X : 〈HX, IX〉, where HX

indicates the linguistic name of the feature, and IX is either a numeric intervalor a categorical set that presents the possible range of values for X.

Example 3.1. Consider the leaf data set [206] which is a set of photographedleaf samples (observation set Dl) from various plant species (classes) such as:Y = { yqr : ‘Quercus Robur’, yap : ‘Acer Palmatum’, yno : ‘Nerium Oleander’,ytt : ‘Tilia Tomentosa’, . . .}. This data set includes a set of measurablefeatures to characterise the features of each leaf sample, such as: F = {

Xel : elongation, Xlo : lobedness, Xco : convexity, Xro : roundness,Xso : solidity, Xin : indentation, . . .}. An observed leaf such as oi ∈ Dl

that is labelled by ytt takes the feature values as: oi : (xoi,ytt), where

xoi= [xel, xlo, xco, xro, xso, xin].

The goal described in this chapter is to find a mapping from the elements ofa data set M to various components needed to define a conceptual space S. Inshort, this mapping is achieved by performing the following steps:

• Initialise the primitive known concepts using the class labels. Conse-quently, the conceptual space S, which models the data set M, will consistof a set of concepts C = {C1, . . . ,Cm}, where |C| = |Y|. Thus, the notationCy indicates the concept which corresponds to the class label y ∈ Y.1

• Specify the quality dimensions Q and domains Δ. The quality dimensionsare specified via selecting a subset of the features such that Q ⊂ F, andthe domains are determined based on ranking and grouping the set ofselected features as the quality dimensions (Section 3.1).

• Form the representation of each concept Cy within the domains Δ, basedon the known corresponding instances (Section 3.2).

The following sections will outline how the two latter steps are achieved indetail. Figure 3.1 illustrates the steps of constructing a conceptual space from aset of numeric data, which are explained in the following sections.

1This approach follows the assumption that any semantic modelling needs an innate set ofknowledge [176]. Learning concepts without any initialised knowledge is not the scope of thiswork.

36CHAPTER3.CONSTRUCTIONOFCONCEPTUALSPACES

Tobegin,itisassumedthatagivendatasetMcontainsasetofpossibleclasslabels,asetofpredefinedfeatures,andtheinputobservationswithknownclasslabels,whicharecharacterisedbyfeaturevalues.GivenasetofclasslabelsY={y1,...,ym}andasetoffeaturesF={X1,...,Xn},letDbethesetofknownobservations,denotedbyD={oi:(xoi,yoi)},whereoiconsistsofan-dimensionalfeaturevectorxoi=[x1,...,xn],andanoutputlabelyoi∈Y.Thecomponentxj(j=1,...n)inthevectorxoiisthemeasuredvalueofthecorrespondingfeatureXj∈F.

Here,eachfeatureXisdefinedasacoupleofvaluesX:〈HX,IX〉,whereHX

indicatesthelinguisticnameofthefeature,andIXiseitheranumericintervaloracategoricalsetthatpresentsthepossiblerangeofvaluesforX.

Example3.1.Considertheleafdataset[206]whichisasetofphotographedleafsamples(observationsetD

l)fromvariousplantspecies(classes)suchas:

Y={yqr:‘QuercusRobur’,yap:‘AcerPalmatum’,yno:‘NeriumOleander’,ytt:‘TiliaTomentosa’,...}.Thisdatasetincludesasetofmeasurablefeaturestocharacterisethefeaturesofeachleafsample,suchas:F={

Xel:elongation,Xlo:lobedness,Xco:convexity,Xro:roundness,Xso:solidity,Xin:indentation,...}.Anobservedleafsuchasoi∈D

l

thatislabelledbyytttakesthefeaturevaluesas:oi:(xoi,ytt),wherexoi=[xel,xlo,xco,xro,xso,xin].

ThegoaldescribedinthischapteristofindamappingfromtheelementsofadatasetMtovariouscomponentsneededtodefineaconceptualspaceS.Inshort,thismappingisachievedbyperformingthefollowingsteps:

•Initialisetheprimitiveknownconceptsusingtheclasslabels.Conse-quently,theconceptualspaceS,whichmodelsthedatasetM,willconsistofasetofconceptsC={C1,...,Cm},where|C|=|Y|.Thus,thenotationCyindicatestheconceptwhichcorrespondstotheclasslabely∈Y.1

•SpecifythequalitydimensionsQanddomainsΔ.ThequalitydimensionsarespecifiedviaselectingasubsetofthefeaturessuchthatQ⊂F,andthedomainsaredeterminedbasedonrankingandgroupingthesetofselectedfeaturesasthequalitydimensions(Section3.1).

•FormtherepresentationofeachconceptCywithinthedomainsΔ,basedontheknowncorrespondinginstances(Section3.2).

Thefollowingsectionswilloutlinehowthetwolatterstepsareachievedindetail.Figure3.1illustratesthestepsofconstructingaconceptualspacefromasetofnumericdata,whichareexplainedinthefollowingsections.

1Thisapproachfollowstheassumptionthatanysemanticmodellingneedsaninnatesetofknowledge[176].Learningconceptswithoutanyinitialisedknowledgeisnotthescopeofthiswork.


To begin, it is assumed that a given data set M contains a set of possible classlabels, a set of predefined features, and the input observations with known classlabels, which are characterised by feature values. Given a set of class labelsY = {y1, . . . ,ym} and a set of features F = {X1, . . . ,Xn}, let D be the set ofknown observations, denoted by D = {oi : (xoi

,yoi)}, where oi consists of a

n-dimensional feature vector xoi= [x1, . . . , xn], and an output label yoi

∈ Y.The component xj (j = 1, . . .n) in the vector xoi

is the measured value of thecorresponding feature Xj ∈ F.

Here, each feature X is defined as a couple of values X : 〈HX, IX〉, where HX

indicates the linguistic name of the feature, and IX is either a numeric intervalor a categorical set that presents the possible range of values for X.

Example 3.1. Consider the leaf data set [206] which is a set of photographedleaf samples (observation set Dl) from various plant species (classes) such as:Y = { yqr : ‘Quercus Robur’, yap : ‘Acer Palmatum’, yno : ‘Nerium Oleander’,ytt : ‘Tilia Tomentosa’, . . .}. This data set includes a set of measurablefeatures to characterise the features of each leaf sample, such as: F = {

Xel : elongation, Xlo : lobedness, Xco : convexity, Xro : roundness,Xso : solidity, Xin : indentation, . . .}. An observed leaf such as oi ∈ Dl

that is labelled by ytt takes the feature values as: oi : (xoi,ytt), where

xoi= [xel, xlo, xco, xro, xso, xin].

The goal described in this chapter is to find a mapping from the elements ofa data set M to various components needed to define a conceptual space S. Inshort, this mapping is achieved by performing the following steps:

• Initialise the primitive known concepts using the class labels. Conse-quently, the conceptual space S, which models the data set M, will consistof a set of concepts C = {C1, . . . ,Cm}, where |C| = |Y|. Thus, the notationCy indicates the concept which corresponds to the class label y ∈ Y.1

• Specify the quality dimensions Q and domains Δ. The quality dimensionsare specified via selecting a subset of the features such that Q ⊂ F, andthe domains are determined based on ranking and grouping the set ofselected features as the quality dimensions (Section 3.1).

• Form the representation of each concept Cy within the domains Δ, basedon the known corresponding instances (Section 3.2).

The following sections will outline how the two latter steps are achieved indetail. Figure 3.1 illustrates the steps of constructing a conceptual space from aset of numeric data, which are explained in the following sections.

1This approach follows the assumption that any semantic modelling needs an innate set ofknowledge [176]. Learning concepts without any initialised knowledge is not the scope of thiswork.


Tobegin,itisassumedthatagivendatasetMcontainsasetofpossibleclasslabels,asetofpredefinedfeatures,andtheinputobservationswithknownclasslabels,whicharecharacterisedbyfeaturevalues.GivenasetofclasslabelsY={y1,...,ym}andasetoffeaturesF={X1,...,Xn},letDbethesetofknownobservations,denotedbyD={oi:(xoi,yoi)},whereoiconsistsofan-dimensionalfeaturevectorxoi=[x1,...,xn],andanoutputlabelyoi∈Y.Thecomponentxj(j=1,...n)inthevectorxoiisthemeasuredvalueofthecorrespondingfeatureXj∈F.

Here,eachfeatureXisdefinedasacoupleofvaluesX:〈HX,IX〉,whereHX

indicatesthelinguisticnameofthefeature,andIXiseitheranumericintervaloracategoricalsetthatpresentsthepossiblerangeofvaluesforX.

Example3.1.Considertheleafdataset[206]whichisasetofphotographedleafsamples(observationsetD

l)fromvariousplantspecies(classes)suchas:

Y={yqr:‘QuercusRobur’,yap:‘AcerPalmatum’,yno:‘NeriumOleander’,ytt:‘TiliaTomentosa’,...}.Thisdatasetincludesasetofmeasurablefeaturestocharacterisethefeaturesofeachleafsample,suchas:F={

Xel:elongation,Xlo:lobedness,Xco:convexity,Xro:roundness,Xso:solidity,Xin:indentation,...}.Anobservedleafsuchasoi∈D

l

thatislabelledbyytttakesthefeaturevaluesas:oi:(xoi,ytt),wherexoi=[xel,xlo,xco,xro,xso,xin].

ThegoaldescribedinthischapteristofindamappingfromtheelementsofadatasetMtovariouscomponentsneededtodefineaconceptualspaceS.Inshort,thismappingisachievedbyperformingthefollowingsteps:

•Initialisetheprimitiveknownconceptsusingtheclasslabels.Conse-quently,theconceptualspaceS,whichmodelsthedatasetM,willconsistofasetofconceptsC={C1,...,Cm},where|C|=|Y|.Thus,thenotationCyindicatestheconceptwhichcorrespondstotheclasslabely∈Y.1

•SpecifythequalitydimensionsQanddomainsΔ.ThequalitydimensionsarespecifiedviaselectingasubsetofthefeaturessuchthatQ⊂F,andthedomainsaredeterminedbasedonrankingandgroupingthesetofselectedfeaturesasthequalitydimensions(Section3.1).

•FormtherepresentationofeachconceptCywithinthedomainsΔ,basedontheknowncorrespondinginstances(Section3.2).

Thefollowingsectionswilloutlinehowthetwolatterstepsareachievedindetail.Figure3.1illustratesthestepsofconstructingaconceptualspacefromasetofnumericdata,whichareexplainedinthefollowingsections.

1Thisapproachfollowstheassumptionthatanysemanticmodellingneedsaninnatesetofknowledge[176].Learningconceptswithoutanyinitialisedknowledgeisnotthescopeofthiswork.

3.1. DOMAIN AND QUALITY DIMENSION SPECIFICATION 37

Figure 3.1: Illustration of the main steps for constructing a conceptual spacefrom a set of numeric data. The domain and dimension specification is ex-plained in Section 3.1, and the concept representation is described in Sec-tion 3.2.

3.1 Domain and Quality Dimension Specification: A

Feature Selection Approach

A data-driven approach to build a conceptual space makes no prior assump-tion about the domains. Preferably, the known labelled observations and fea-tures are the inputs from which the quality dimensions and domains will beextracted. This approach aims to propose a set of observation-based associa-tions between classes of data as concepts and the grouped subsets of featuresas domains. As a domain is an integrated subset of quality dimensions, and thequality dimensions are the subset of the initialised features, the first step is todetermine a subset of informative features. This determination is performed byapplying feature selection methods. Before explaining these methods, the for-mal definitions of a quality dimension and a domain is recalled. As mentionedbefore, these definitions are reshaped in a novel manner to be utilised in thetask of semantic inference.

Definition 3.2. A quality dimension qX ∈ Q is a triple 〈Hq, Iq,μq〉, whichcorresponds to a selected feature X ∈ F. Hq is the linguistic name of the qualitydimension qX, which is equal to HX and Iq indicates the range of possiblevalues for the quality dimension qX, which is equal to IX. μq is defined as afamily of fuzzy membership functions2 to map the subintervals of Iq onto a setof linguistic terms.

Definition 3.3. A domain δ is a triple 〈Q(δ),C(δ),ωδ〉, where Q(δ) ⊂ Q is theset of integral quality dimensions involved in δ, C(δ) ⊂ C is the set of conceptsthat are represented in δ, and ω(δ) is a weight function3 presenting the assignedsalient weight between a concept and a quality dimension in δ.

2More details on the fuzzy membership functions μ, which quantify the changes in the val-ues of a feature by assigning linguistic labels to the subintervals of dimension, will be given inSection 4.2.1.

3The weight function ω(δ) is further explained in Section 3.1.2.

3.1.DOMAINANDQUALITYDIMENSIONSPECIFICATION37

Figure3.1:Illustrationofthemainstepsforconstructingaconceptualspacefromasetofnumericdata.Thedomainanddimensionspecificationisex-plainedinSection3.1,andtheconceptrepresentationisdescribedinSec-tion3.2.

3.1DomainandQualityDimensionSpecification:A

FeatureSelectionApproach

Adata-drivenapproachtobuildaconceptualspacemakesnopriorassump-tionaboutthedomains.Preferably,theknownlabelledobservationsandfea-turesaretheinputsfromwhichthequalitydimensionsanddomainswillbeextracted.Thisapproachaimstoproposeasetofobservation-basedassocia-tionsbetweenclassesofdataasconceptsandthegroupedsubsetsoffeaturesasdomains.Asadomainisanintegratedsubsetofqualitydimensions,andthequalitydimensionsarethesubsetoftheinitialisedfeatures,thefirststepistodetermineasubsetofinformativefeatures.Thisdeterminationisperformedbyapplyingfeatureselectionmethods.Beforeexplainingthesemethods,thefor-maldefinitionsofaqualitydimensionandadomainisrecalled.Asmentionedbefore,thesedefinitionsarereshapedinanovelmannertobeutilisedinthetaskofsemanticinference.

Definition3.2.AqualitydimensionqX∈Qisatriple〈Hq,Iq,μq〉,whichcorrespondstoaselectedfeatureX∈F.HqisthelinguisticnameofthequalitydimensionqX,whichisequaltoHXandIqindicatestherangeofpossiblevaluesforthequalitydimensionqX,whichisequaltoIX.μqisdefinedasafamilyoffuzzymembershipfunctions2tomapthesubintervalsofIqontoasetoflinguisticterms.

Definition3.3.Adomainδisatriple〈Q(δ),C(δ),ωδ〉,whereQ(δ)⊂Qisthesetofintegralqualitydimensionsinvolvedinδ,C(δ)⊂Cisthesetofconceptsthatarerepresentedinδ,andω(δ)isaweightfunction3presentingtheassignedsalientweightbetweenaconceptandaqualitydimensioninδ.

2Moredetailsonthefuzzymembershipfunctionsμ,whichquantifythechangesintheval-uesofafeaturebyassigninglinguisticlabelstothesubintervalsofdimension,willbegiveninSection4.2.1.

3Theweightfunctionω(δ)isfurtherexplainedinSection3.1.2.


Figure 3.1: Illustration of the main steps for constructing a conceptual spacefrom a set of numeric data. The domain and dimension specification is ex-plained in Section 3.1, and the concept representation is described in Sec-tion 3.2.

3.1 Domain and Quality Dimension Specification: A

Feature Selection Approach

A data-driven approach to build a conceptual space makes no prior assump-tion about the domains. Preferably, the known labelled observations and fea-tures are the inputs from which the quality dimensions and domains will beextracted. This approach aims to propose a set of observation-based associa-tions between classes of data as concepts and the grouped subsets of featuresas domains. As a domain is an integrated subset of quality dimensions, and thequality dimensions are the subset of the initialised features, the first step is todetermine a subset of informative features. This determination is performed byapplying feature selection methods. Before explaining these methods, the for-mal definitions of a quality dimension and a domain is recalled. As mentionedbefore, these definitions are reshaped in a novel manner to be utilised in thetask of semantic inference.

Definition 3.2. A quality dimension qX ∈ Q is a triple 〈Hq, Iq,μq〉, whichcorresponds to a selected feature X ∈ F. Hq is the linguistic name of the qualitydimension qX, which is equal to HX and Iq indicates the range of possiblevalues for the quality dimension qX, which is equal to IX. μq is defined as afamily of fuzzy membership functions2 to map the subintervals of Iq onto a setof linguistic terms.

Definition 3.3. A domain δ is a triple 〈Q(δ),C(δ),ωδ〉, where Q(δ) ⊂ Q is theset of integral quality dimensions involved in δ, C(δ) ⊂ C is the set of conceptsthat are represented in δ, and ω(δ) is a weight function3 presenting the assignedsalient weight between a concept and a quality dimension in δ.

2More details on the fuzzy membership functions μ, which quantify the changes in the val-ues of a feature by assigning linguistic labels to the subintervals of dimension, will be given inSection 4.2.1.

3The weight function ω(δ) is further explained in Section 3.1.2.


Figure3.1:Illustrationofthemainstepsforconstructingaconceptualspacefromasetofnumericdata.Thedomainanddimensionspecificationisex-plainedinSection3.1,andtheconceptrepresentationisdescribedinSec-tion3.2.

3.1DomainandQualityDimensionSpecification:A

FeatureSelectionApproach

Adata-drivenapproachtobuildaconceptualspacemakesnopriorassump-tionaboutthedomains.Preferably,theknownlabelledobservationsandfea-turesaretheinputsfromwhichthequalitydimensionsanddomainswillbeextracted.Thisapproachaimstoproposeasetofobservation-basedassocia-tionsbetweenclassesofdataasconceptsandthegroupedsubsetsoffeaturesasdomains.Asadomainisanintegratedsubsetofqualitydimensions,andthequalitydimensionsarethesubsetoftheinitialisedfeatures,thefirststepistodetermineasubsetofinformativefeatures.Thisdeterminationisperformedbyapplyingfeatureselectionmethods.Beforeexplainingthesemethods,thefor-maldefinitionsofaqualitydimensionandadomainisrecalled.Asmentionedbefore,thesedefinitionsarereshapedinanovelmannertobeutilisedinthetaskofsemanticinference.

Definition3.2.AqualitydimensionqX∈Qisatriple〈Hq,Iq,μq〉,whichcorrespondstoaselectedfeatureX∈F.HqisthelinguisticnameofthequalitydimensionqX,whichisequaltoHXandIqindicatestherangeofpossiblevaluesforthequalitydimensionqX,whichisequaltoIX.μqisdefinedasafamilyoffuzzymembershipfunctions2tomapthesubintervalsofIqontoasetoflinguisticterms.

Definition3.3.Adomainδisatriple〈Q(δ),C(δ),ωδ〉,whereQ(δ)⊂Qisthesetofintegralqualitydimensionsinvolvedinδ,C(δ)⊂Cisthesetofconceptsthatarerepresentedinδ,andω(δ)isaweightfunction3presentingtheassignedsalientweightbetweenaconceptandaqualitydimensioninδ.

2Moredetailsonthefuzzymembershipfunctionsμ,whichquantifythechangesintheval-uesofafeaturebyassigninglinguisticlabelstothesubintervalsofdimension,willbegiveninSection4.2.1.

3Theweightfunctionω(δ)isfurtherexplainedinSection3.1.2.


Example 3.2. Consider the leaf data set from Example 3.1, suppose that a qual-ity dimension is elongation, which is defined as qel = 〈 ‘elongation’, [0, 1],μel〉, and another one is lobedness, defined as qlo = 〈 ‘lobedness’, (0, inf),μlo〉. One can conceptualise the leaves in various domains such as Shape, Tex-ture, Colour, etc. Then, both the elongation and lobedness quality dimensionscan belong to the shape domain. Moreover, μel can return the linguistic labelsfor elongation as: ‘circular’, ‘elliptical’, ‘elongated’.

Since the domains are constructed in a data-driven way without involv-ing prior knowledge, it can be difficult to assign a semantic interpretation tothe constructed domains that reflect human perception. However, the providedspace still constitutes a conceptual one because of its ability to represent theconcept formation and the semantic similarities between concepts and instancesacross the domains [87]. With quality dimensions in place, domain is then anintegral subset of quality dimensions which are relevant to each other.

Identifying the most characteristic features of the data from the initialisedset of features is an essential task, which is performed by feature extractionapproaches [42]. There are two principal ways to extract informative features:feature transformation and feature selection [110]. The first approach findsa projection from the original feature space into a lower dimensional featurespace. Transforming the original features into this lower dimensional space al-ters typically any associated descriptive attributes connected to the features.Therefore, the semantic meaning of the resulting features is often difficult, ifnot impossible, to assess [109]. The second approach selects a subset of origi-nal features by keeping relevant features and discarding the irrelevant ones. Theretained features are not altered and the original semantic meaning of those fea-tures stays intact. Since the goal is to exploit external knowledge of the originalfeatures, feature subset selection techniques are applied.

Both relevance, and redundancy are important criteria to consider in featureselection. A subset of features is optimal if the relevance between selected fea-tures and the target classes is maximal, and the redundancy among the selectedfeatures is minimal. These two criteria guarantee that the selected features areadequate to distinguish the classes of data with the smallest number of fea-tures [75].

The proposed approach employs feature selection methods to specify thequality dimensions and domains using two phases: feature subset ranking andfeature subset grouping. The feature subset ranking phase determines whichfeatures are most representative for every single target class, independent fromother classes. Since a concept can rather be represented by one or several groupsof features as domains, the feature subset grouping phase categorises the rankedfeatures in a way to recognise what subset of features are most related to eachother based on their relevancy to the concepts.

Figure 3.2 shows the two phases of (1) feature subset ranking and (2) fea-ture subset grouping, along with the input and output parameters, to specify


Example3.2.ConsidertheleafdatasetfromExample3.1,supposethataqual-itydimensioniselongation,whichisdefinedasqel=〈‘elongation’,[0,1],μel〉,andanotheroneislobedness,definedasqlo=〈‘lobedness’,(0,inf),μlo〉.OnecanconceptualisetheleavesinvariousdomainssuchasShape,Tex-ture,Colour,etc.Then,boththeelongationandlobednessqualitydimensionscanbelongtotheshapedomain.Moreover,μelcanreturnthelinguisticlabelsforelongationas:‘circular’,‘elliptical’,‘elongated’.

Sincethedomainsareconstructedinadata-drivenwaywithoutinvolv-ingpriorknowledge,itcanbedifficulttoassignasemanticinterpretationtotheconstructeddomainsthatreflecthumanperception.However,theprovidedspacestillconstitutesaconceptualonebecauseofitsabilitytorepresenttheconceptformationandthesemanticsimilaritiesbetweenconceptsandinstancesacrossthedomains[87].Withqualitydimensionsinplace,domainisthenanintegralsubsetofqualitydimensionswhicharerelevanttoeachother.

Identifyingthemostcharacteristicfeaturesofthedatafromtheinitialisedsetoffeaturesisanessentialtask,whichisperformedbyfeatureextractionapproaches[42].Therearetwoprincipalwaystoextractinformativefeatures:featuretransformationandfeatureselection[110].Thefirstapproachfindsaprojectionfromtheoriginalfeaturespaceintoalowerdimensionalfeaturespace.Transformingtheoriginalfeaturesintothislowerdimensionalspaceal-terstypicallyanyassociateddescriptiveattributesconnectedtothefeatures.Therefore,thesemanticmeaningoftheresultingfeaturesisoftendifficult,ifnotimpossible,toassess[109].Thesecondapproachselectsasubsetoforigi-nalfeaturesbykeepingrelevantfeaturesanddiscardingtheirrelevantones.Theretainedfeaturesarenotalteredandtheoriginalsemanticmeaningofthosefea-turesstaysintact.Sincethegoalistoexploitexternalknowledgeoftheoriginalfeatures,featuresubsetselectiontechniquesareapplied.

Bothrelevance,andredundancyareimportantcriteriatoconsiderinfeatureselection.Asubsetoffeaturesisoptimaliftherelevancebetweenselectedfea-turesandthetargetclassesismaximal,andtheredundancyamongtheselectedfeaturesisminimal.Thesetwocriteriaguaranteethattheselectedfeaturesareadequatetodistinguishtheclassesofdatawiththesmallestnumberoffea-tures[75].

Theproposedapproachemploysfeatureselectionmethodstospecifythequalitydimensionsanddomainsusingtwophases:featuresubsetrankingandfeaturesubsetgrouping.Thefeaturesubsetrankingphasedetermineswhichfeaturesaremostrepresentativeforeverysingletargetclass,independentfromotherclasses.Sinceaconceptcanratherberepresentedbyoneorseveralgroupsoffeaturesasdomains,thefeaturesubsetgroupingphasecategorisestherankedfeaturesinawaytorecognisewhatsubsetoffeaturesaremostrelatedtoeachotherbasedontheirrelevancytotheconcepts.

Figure3.2showsthetwophasesof(1)featuresubsetrankingand(2)fea-turesubsetgrouping,alongwiththeinputandoutputparameters,tospecify


Example 3.2. Consider the leaf data set from Example 3.1, suppose that a qual-ity dimension is elongation, which is defined as qel = 〈 ‘elongation’, [0, 1],μel〉, and another one is lobedness, defined as qlo = 〈 ‘lobedness’, (0, inf),μlo〉. One can conceptualise the leaves in various domains such as Shape, Tex-ture, Colour, etc. Then, both the elongation and lobedness quality dimensionscan belong to the shape domain. Moreover, μel can return the linguistic labelsfor elongation as: ‘circular’, ‘elliptical’, ‘elongated’.

Since the domains are constructed in a data-driven way without involv-ing prior knowledge, it can be difficult to assign a semantic interpretation tothe constructed domains that reflect human perception. However, the providedspace still constitutes a conceptual one because of its ability to represent theconcept formation and the semantic similarities between concepts and instancesacross the domains [87]. With quality dimensions in place, domain is then anintegral subset of quality dimensions which are relevant to each other.

Identifying the most characteristic features of the data from the initialisedset of features is an essential task, which is performed by feature extractionapproaches [42]. There are two principal ways to extract informative features:feature transformation and feature selection [110]. The first approach findsa projection from the original feature space into a lower dimensional featurespace. Transforming the original features into this lower dimensional space al-ters typically any associated descriptive attributes connected to the features.Therefore, the semantic meaning of the resulting features is often difficult, ifnot impossible, to assess [109]. The second approach selects a subset of origi-nal features by keeping relevant features and discarding the irrelevant ones. Theretained features are not altered and the original semantic meaning of those fea-tures stays intact. Since the goal is to exploit external knowledge of the originalfeatures, feature subset selection techniques are applied.

Both relevance, and redundancy are important criteria to consider in featureselection. A subset of features is optimal if the relevance between selected fea-tures and the target classes is maximal, and the redundancy among the selectedfeatures is minimal. These two criteria guarantee that the selected features areadequate to distinguish the classes of data with the smallest number of fea-tures [75].

The proposed approach employs feature selection methods to specify thequality dimensions and domains using two phases: feature subset ranking andfeature subset grouping. The feature subset ranking phase determines whichfeatures are most representative for every single target class, independent fromother classes. Since a concept can rather be represented by one or several groupsof features as domains, the feature subset grouping phase categorises the rankedfeatures in a way to recognise what subset of features are most related to eachother based on their relevancy to the concepts.

Figure 3.2 shows the two phases of (1) feature subset ranking and (2) fea-ture subset grouping, along with the input and output parameters, to specify


Example3.2.ConsidertheleafdatasetfromExample3.1,supposethataqual-itydimensioniselongation,whichisdefinedasqel=〈‘elongation’,[0,1],μel〉,andanotheroneislobedness,definedasqlo=〈‘lobedness’,(0,inf),μlo〉.OnecanconceptualisetheleavesinvariousdomainssuchasShape,Tex-ture,Colour,etc.Then,boththeelongationandlobednessqualitydimensionscanbelongtotheshapedomain.Moreover,μelcanreturnthelinguisticlabelsforelongationas:‘circular’,‘elliptical’,‘elongated’.

Sincethedomainsareconstructedinadata-drivenwaywithoutinvolv-ingpriorknowledge,itcanbedifficulttoassignasemanticinterpretationtotheconstructeddomainsthatreflecthumanperception.However,theprovidedspacestillconstitutesaconceptualonebecauseofitsabilitytorepresenttheconceptformationandthesemanticsimilaritiesbetweenconceptsandinstancesacrossthedomains[87].Withqualitydimensionsinplace,domainisthenanintegralsubsetofqualitydimensionswhicharerelevanttoeachother.

Identifyingthemostcharacteristicfeaturesofthedatafromtheinitialisedsetoffeaturesisanessentialtask,whichisperformedbyfeatureextractionapproaches[42].Therearetwoprincipalwaystoextractinformativefeatures:featuretransformationandfeatureselection[110].Thefirstapproachfindsaprojectionfromtheoriginalfeaturespaceintoalowerdimensionalfeaturespace.Transformingtheoriginalfeaturesintothislowerdimensionalspaceal-terstypicallyanyassociateddescriptiveattributesconnectedtothefeatures.Therefore,thesemanticmeaningoftheresultingfeaturesisoftendifficult,ifnotimpossible,toassess[109].Thesecondapproachselectsasubsetoforigi-nalfeaturesbykeepingrelevantfeaturesanddiscardingtheirrelevantones.Theretainedfeaturesarenotalteredandtheoriginalsemanticmeaningofthosefea-turesstaysintact.Sincethegoalistoexploitexternalknowledgeoftheoriginalfeatures,featuresubsetselectiontechniquesareapplied.

Bothrelevance,andredundancyareimportantcriteriatoconsiderinfeatureselection.Asubsetoffeaturesisoptimaliftherelevancebetweenselectedfea-turesandthetargetclassesismaximal,andtheredundancyamongtheselectedfeaturesisminimal.Thesetwocriteriaguaranteethattheselectedfeaturesareadequatetodistinguishtheclassesofdatawiththesmallestnumberoffea-tures[75].

Theproposedapproachemploysfeatureselectionmethodstospecifythequalitydimensionsanddomainsusingtwophases:featuresubsetrankingandfeaturesubsetgrouping.Thefeaturesubsetrankingphasedetermineswhichfeaturesaremostrepresentativeforeverysingletargetclass,independentfromotherclasses.Sinceaconceptcanratherberepresentedbyoneorseveralgroupsoffeaturesasdomains,thefeaturesubsetgroupingphasecategorisestherankedfeaturesinawaytorecognisewhatsubsetoffeaturesaremostrelatedtoeachotherbasedontheirrelevancytotheconcepts.

Figure3.2showsthetwophasesof(1)featuresubsetrankingand(2)fea-turesubsetgrouping,alongwiththeinputandoutputparameters,tospecify


Feature SubsetGrouping

(algorithm 3.2)

Feature Sub-set Ranking

(algorithm 3.1)

R,F ′D={o1, ...,oN}

Y={y1, ...,ym}

F={X1, ...,Xn}

Δ

Q

Domain and Dimension Specification

Figure 3.2: Two phases of the domain and quality dimension specification, withinput and output parameters of each phase.

the suitable quality dimensions within a set of domains (the middle block inFigure 3.1). The following sections explain these two phases in details.

3.1.1 Feature Subset Ranking

Feature subset selection algorithms are categorized as either filter methods orwrapper methods [230]. Filter methods determine the subset of features basedon the statistical characteristics of the input data set without referring to theused classifier. Wrapper methods are dependent on the learning algorithm (i.e.,target classifier) that evaluates the selected subset of features based on the per-formance of the used learning algorithm. In the present work, the aim is toidentify the meaningful set of understandable attributes out of predefined fea-tures, but not to classify the input data. Filter methods are chosen to be usedfor feature selection since this category of methods is independent of the finalclassifier approach, and it derives an informative subset of features concerningthe input data set labels. It is worth to note that although the filter methods aremore computationally efficient, the evaluation of such methods is not a trivialtask [230] since there is no universally accepted relevance measure between asubset of selected features and the target class labels.

Filter methods rank the features using a scoring function, usually by em-ploying a statistical measure or the measures from information theory to quan-tify relevance and redundancy. The top scored features are kept as selectedfeatures (or low scored ones are removed from the resulting subset). In thiswork, mutual information, one of the commonly employed scoring functions,is used [47]. One such technique is MIFS (mutual information-based featureselection) [34]. For an input set of features F and 2-class labelled data D, MIFSadds the feature Xi ∈ F to the already chosen subset of features F ′, to maximise

I(D,Xi) − β∑

Xj∈F ′I(Xi,Xj), (3.1)


FeatureSubsetGrouping

(algorithm3.2)

FeatureSub-setRanking

(algorithm3.1)

R,F′D={o1,...,oN}

Y={y1,...,ym}

F={X1,...,Xn}

Δ

Q

DomainandDimensionSpecification

Figure3.2:Twophasesofthedomainandqualitydimensionspecification,withinputandoutputparametersofeachphase.

thesuitablequalitydimensionswithinasetofdomains(themiddleblockinFigure3.1).Thefollowingsectionsexplainthesetwophasesindetails.

3.1.1FeatureSubsetRanking

Featuresubsetselectionalgorithmsarecategorizedaseitherfiltermethodsorwrappermethods[230].Filtermethodsdeterminethesubsetoffeaturesbasedonthestatisticalcharacteristicsoftheinputdatasetwithoutreferringtotheusedclassifier.Wrappermethodsaredependentonthelearningalgorithm(i.e.,targetclassifier)thatevaluatestheselectedsubsetoffeaturesbasedontheper-formanceoftheusedlearningalgorithm.Inthepresentwork,theaimistoidentifythemeaningfulsetofunderstandableattributesoutofpredefinedfea-tures,butnottoclassifytheinputdata.Filtermethodsarechosentobeusedforfeatureselectionsincethiscategoryofmethodsisindependentofthefinalclassifierapproach,anditderivesaninformativesubsetoffeaturesconcerningtheinputdatasetlabels.Itisworthtonotethatalthoughthefiltermethodsaremorecomputationallyefficient,theevaluationofsuchmethodsisnotatrivialtask[230]sincethereisnouniversallyacceptedrelevancemeasurebetweenasubsetofselectedfeaturesandthetargetclasslabels.

Filtermethodsrankthefeaturesusingascoringfunction,usuallybyem-ployingastatisticalmeasureorthemeasuresfrominformationtheorytoquan-tifyrelevanceandredundancy.Thetopscoredfeaturesarekeptasselectedfeatures(orlowscoredonesareremovedfromtheresultingsubset).Inthiswork,mutualinformation,oneofthecommonlyemployedscoringfunctions,isused[47].OnesuchtechniqueisMIFS(mutualinformation-basedfeatureselection)[34].ForaninputsetoffeaturesFand2-classlabelleddataD,MIFSaddsthefeatureXi∈FtothealreadychosensubsetoffeaturesF′,tomaximise

I(D,Xi)−β∑Xj∈F′

I(Xi,Xj),(3.1)


Feature SubsetGrouping

(algorithm 3.2)

Feature Sub-set Ranking

(algorithm 3.1)

R,F ′D={o1, ...,oN}

Y={y1, ...,ym}

F={X1, ...,Xn}

Δ

Q

Domain and Dimension Specification

Figure 3.2: Two phases of the domain and quality dimension specification, withinput and output parameters of each phase.

the suitable quality dimensions within a set of domains (the middle block inFigure 3.1). The following sections explain these two phases in details.

3.1.1 Feature Subset Ranking

Feature subset selection algorithms are categorized as either filter methods orwrapper methods [230]. Filter methods determine the subset of features basedon the statistical characteristics of the input data set without referring to theused classifier. Wrapper methods are dependent on the learning algorithm (i.e.,target classifier) that evaluates the selected subset of features based on the per-formance of the used learning algorithm. In the present work, the aim is toidentify the meaningful set of understandable attributes out of predefined fea-tures, but not to classify the input data. Filter methods are chosen to be usedfor feature selection since this category of methods is independent of the finalclassifier approach, and it derives an informative subset of features concerningthe input data set labels. It is worth to note that although the filter methods aremore computationally efficient, the evaluation of such methods is not a trivialtask [230] since there is no universally accepted relevance measure between asubset of selected features and the target class labels.

Filter methods rank the features using a scoring function, usually by em-ploying a statistical measure or the measures from information theory to quan-tify relevance and redundancy. The top scored features are kept as selectedfeatures (or low scored ones are removed from the resulting subset). In thiswork, mutual information, one of the commonly employed scoring functions,is used [47]. One such technique is MIFS (mutual information-based featureselection) [34]. For an input set of features F and 2-class labelled data D, MIFSadds the feature Xi ∈ F to the already chosen subset of features F ′, to maximise

I(D,Xi) − β∑

Xj∈F ′I(Xi,Xj), (3.1)


FeatureSubsetGrouping

(algorithm3.2)

FeatureSub-setRanking

(algorithm3.1)

R,F′D={o1,...,oN}

Y={y1,...,ym}

F={X1,...,Xn}

Δ

Q

DomainandDimensionSpecification

Figure3.2:Twophasesofthedomainandqualitydimensionspecification,withinputandoutputparametersofeachphase.

thesuitablequalitydimensionswithinasetofdomains(themiddleblockinFigure3.1).Thefollowingsectionsexplainthesetwophasesindetails.

3.1.1FeatureSubsetRanking

Featuresubsetselectionalgorithmsarecategorizedaseitherfiltermethodsorwrappermethods[230].Filtermethodsdeterminethesubsetoffeaturesbasedonthestatisticalcharacteristicsoftheinputdatasetwithoutreferringtotheusedclassifier.Wrappermethodsaredependentonthelearningalgorithm(i.e.,targetclassifier)thatevaluatestheselectedsubsetoffeaturesbasedontheper-formanceoftheusedlearningalgorithm.Inthepresentwork,theaimistoidentifythemeaningfulsetofunderstandableattributesoutofpredefinedfea-tures,butnottoclassifytheinputdata.Filtermethodsarechosentobeusedforfeatureselectionsincethiscategoryofmethodsisindependentofthefinalclassifierapproach,anditderivesaninformativesubsetoffeaturesconcerningtheinputdatasetlabels.Itisworthtonotethatalthoughthefiltermethodsaremorecomputationallyefficient,theevaluationofsuchmethodsisnotatrivialtask[230]sincethereisnouniversallyacceptedrelevancemeasurebetweenasubsetofselectedfeaturesandthetargetclasslabels.

Filtermethodsrankthefeaturesusingascoringfunction,usuallybyem-ployingastatisticalmeasureorthemeasuresfrominformationtheorytoquan-tifyrelevanceandredundancy.Thetopscoredfeaturesarekeptasselectedfeatures(orlowscoredonesareremovedfromtheresultingsubset).Inthiswork,mutualinformation,oneofthecommonlyemployedscoringfunctions,isused[47].OnesuchtechniqueisMIFS(mutualinformation-basedfeatureselection)[34].ForaninputsetoffeaturesFand2-classlabelleddataD,MIFSaddsthefeatureXi∈FtothealreadychosensubsetoffeaturesF′,tomaximise

I(D,Xi)−β∑Xj∈F′

I(Xi,Xj),(3.1)


where I(Y,X) is the mutual information between the variables Y and X [223].This mutual information is defined based on the probability density func-tions [65], denoted by:

I(X, Y) =∫x

∫y

p(x,y) logp(x,y)p(x)p(y)

dydx. (3.2)

The first term in Equation 3.1 attempts to maximise the relevance of featureXi to target labelled data, and the second term tries to minimise the redundancybetween Xi and the already selected features in F ′ (using a balancing parameterβ). In this work, the term I(Y,X) is estimated via histograms, but other estima-tion methods are applicable as well [196]. The MIFS technique is a heuristicapproximation, since there is no independent assessment of the joint mutualinformation to determine when a feature is relevant to the class labels [223].The proposed method is not dependent on the use of the MIFS algorithm. Itcan be substituted by other approximations of the joint mutual information[48, 80, 171]. It is notable that different filter methods do not necessarily pro-duce the same ranking of the features. However, the focus here is to reach toa good enough set of features representing the data classes with high relevanceand low redundancy [228].

The proposed method for feature subset ranking starts with defining a newset of input data for each target label y, wherein the data set of known observa-tions D is split into two classes: class y including all the observations labelledby class y, denoted by Dy, and class y = {Y\y} including the rest of obser-vations labelled by other classes than class y, denoted by Dy. Then the MIFSprocedure is applied to the feature set F considering the generated 2-class dataset Dyy = {Dy ∪Dy}.

Example 3.3. In the leaf data set from Example 3.1, for the class of the leafspecies Tilia (ytt), the set of Dytt is the leaf samples in Dl that are labelledwith ytt, and Dytt

is the set of samples in Dl that are not labelled with ytt, orconsequently, are labelled with one of the labels yqr, yap, or yia.

By separating one class (concept) of data from the other classes, the out-put of the feature ranking algorithm will return the features that individuallycharacterise the observations of this concept and separate it from the rest. Theoutput of the filter method for each label is a sorted list of features with a scorefor each feature. Formally, the output for a class y is a ranked list of featureswith the highest scores according to y, as

R(y) = { (X,wy,X) | X ∈ F , wy,X ∈ [0, 1] } (3.3)

where wy,X is the normalised weight (or the score) that is assigned to the re-lation of feature X and label y. The features in R(y) are the k most relevantfeatures of the class label y. From a conceptual point of view, these k features


whereI(Y,X)isthemutualinformationbetweenthevariablesYandX[223].Thismutualinformationisdefinedbasedontheprobabilitydensityfunc-tions[65],denotedby:

I(X,Y)=∫x

∫y

p(x,y)logp(x,y)p(x)p(y)

dydx.(3.2)

ThefirstterminEquation3.1attemptstomaximisetherelevanceoffeatureXitotargetlabelleddata,andthesecondtermtriestominimisetheredundancybetweenXiandthealreadyselectedfeaturesinF′(usingabalancingparameterβ).Inthiswork,thetermI(Y,X)isestimatedviahistograms,butotherestima-tionmethodsareapplicableaswell[196].TheMIFStechniqueisaheuristicapproximation,sincethereisnoindependentassessmentofthejointmutualinformationtodeterminewhenafeatureisrelevanttotheclasslabels[223].TheproposedmethodisnotdependentontheuseoftheMIFSalgorithm.Itcanbesubstitutedbyotherapproximationsofthejointmutualinformation[48,80,171].Itisnotablethatdifferentfiltermethodsdonotnecessarilypro-ducethesamerankingofthefeatures.However,thefocushereistoreachtoagoodenoughsetoffeaturesrepresentingthedataclasseswithhighrelevanceandlowredundancy[228].

Theproposedmethodforfeaturesubsetrankingstartswithdefininganewsetofinputdataforeachtargetlabely,whereinthedatasetofknownobserva-tionsDissplitintotwoclasses:classyincludingalltheobservationslabelledbyclassy,denotedbyDy,andclassy={Y\y}includingtherestofobser-vationslabelledbyotherclassesthanclassy,denotedbyDy.ThentheMIFSprocedureisappliedtothefeaturesetFconsideringthegenerated2-classdatasetDyy={Dy∪Dy}.

Example3.3.IntheleafdatasetfromExample3.1,fortheclassoftheleafspeciesTilia(ytt),thesetofDyttistheleafsamplesinD

lthatarelabelled

withytt,andDyttisthesetofsamplesinDl

thatarenotlabelledwithytt,orconsequently,arelabelledwithoneofthelabelsyqr,yap,oryia.

Byseparatingoneclass(concept)ofdatafromtheotherclasses,theout-putofthefeaturerankingalgorithmwillreturnthefeaturesthatindividuallycharacterisetheobservationsofthisconceptandseparateitfromtherest.Theoutputofthefiltermethodforeachlabelisasortedlistoffeatureswithascoreforeachfeature.Formally,theoutputforaclassyisarankedlistoffeatureswiththehighestscoresaccordingtoy,as

R(y)={(X,wy,X)|X∈F,wy,X∈[0,1]}(3.3)

wherewy,Xisthenormalisedweight(orthescore)thatisassignedtothere-lationoffeatureXandlabely.ThefeaturesinR(y)arethekmostrelevantfeaturesoftheclasslabely.Fromaconceptualpointofview,thesekfeatures


where I(Y,X) is the mutual information between the variables Y and X [223].This mutual information is defined based on the probability density func-tions [65], denoted by:

I(X, Y) =∫x

∫y

p(x,y) logp(x,y)p(x)p(y)

dydx. (3.2)

The first term in Equation 3.1 attempts to maximise the relevance of featureXi to target labelled data, and the second term tries to minimise the redundancybetween Xi and the already selected features in F ′ (using a balancing parameterβ). In this work, the term I(Y,X) is estimated via histograms, but other estima-tion methods are applicable as well [196]. The MIFS technique is a heuristicapproximation, since there is no independent assessment of the joint mutualinformation to determine when a feature is relevant to the class labels [223].The proposed method is not dependent on the use of the MIFS algorithm. Itcan be substituted by other approximations of the joint mutual information[48, 80, 171]. It is notable that different filter methods do not necessarily pro-duce the same ranking of the features. However, the focus here is to reach toa good enough set of features representing the data classes with high relevanceand low redundancy [228].

The proposed method for feature subset ranking starts with defining a newset of input data for each target label y, wherein the data set of known observa-tions D is split into two classes: class y including all the observations labelledby class y, denoted by Dy, and class y = {Y\y} including the rest of obser-vations labelled by other classes than class y, denoted by Dy. Then the MIFSprocedure is applied to the feature set F considering the generated 2-class dataset Dyy = {Dy ∪Dy}.

Example 3.3. In the leaf data set from Example 3.1, for the class of the leafspecies Tilia (ytt), the set of Dytt is the leaf samples in Dl that are labelledwith ytt, and Dytt

is the set of samples in Dl that are not labelled with ytt, orconsequently, are labelled with one of the labels yqr, yap, or yia.

By separating one class (concept) of data from the other classes, the out-put of the feature ranking algorithm will return the features that individuallycharacterise the observations of this concept and separate it from the rest. Theoutput of the filter method for each label is a sorted list of features with a scorefor each feature. Formally, the output for a class y is a ranked list of featureswith the highest scores according to y, as

R(y) = { (X,wy,X) | X ∈ F , wy,X ∈ [0, 1] } (3.3)

where wy,X is the normalised weight (or the score) that is assigned to the re-lation of feature X and label y. The features in R(y) are the k most relevantfeatures of the class label y. From a conceptual point of view, these k features


whereI(Y,X)isthemutualinformationbetweenthevariablesYandX[223].Thismutualinformationisdefinedbasedontheprobabilitydensityfunc-tions[65],denotedby:

I(X,Y)=∫x

∫y

p(x,y)logp(x,y)p(x)p(y)

dydx.(3.2)

ThefirstterminEquation3.1attemptstomaximisetherelevanceoffeatureXitotargetlabelleddata,andthesecondtermtriestominimisetheredundancybetweenXiandthealreadyselectedfeaturesinF′(usingabalancingparameterβ).Inthiswork,thetermI(Y,X)isestimatedviahistograms,butotherestima-tionmethodsareapplicableaswell[196].TheMIFStechniqueisaheuristicapproximation,sincethereisnoindependentassessmentofthejointmutualinformationtodeterminewhenafeatureisrelevanttotheclasslabels[223].TheproposedmethodisnotdependentontheuseoftheMIFSalgorithm.Itcanbesubstitutedbyotherapproximationsofthejointmutualinformation[48,80,171].Itisnotablethatdifferentfiltermethodsdonotnecessarilypro-ducethesamerankingofthefeatures.However,thefocushereistoreachtoagoodenoughsetoffeaturesrepresentingthedataclasseswithhighrelevanceandlowredundancy[228].

Theproposedmethodforfeaturesubsetrankingstartswithdefininganewsetofinputdataforeachtargetlabely,whereinthedatasetofknownobserva-tionsDissplitintotwoclasses:classyincludingalltheobservationslabelledbyclassy,denotedbyDy,andclassy={Y\y}includingtherestofobser-vationslabelledbyotherclassesthanclassy,denotedbyDy.ThentheMIFSprocedureisappliedtothefeaturesetFconsideringthegenerated2-classdatasetDyy={Dy∪Dy}.

Example3.3.IntheleafdatasetfromExample3.1,fortheclassoftheleafspeciesTilia(ytt),thesetofDyttistheleafsamplesinD

lthatarelabelled

withytt,andDyttisthesetofsamplesinDl

thatarenotlabelledwithytt,orconsequently,arelabelledwithoneofthelabelsyqr,yap,oryia.

Byseparatingoneclass(concept)ofdatafromtheotherclasses,theout-putofthefeaturerankingalgorithmwillreturnthefeaturesthatindividuallycharacterisetheobservationsofthisconceptandseparateitfromtherest.Theoutputofthefiltermethodforeachlabelisasortedlistoffeatureswithascoreforeachfeature.Formally,theoutputforaclassyisarankedlistoffeatureswiththehighestscoresaccordingtoy,as

R(y)={(X,wy,X)|X∈F,wy,X∈[0,1]}(3.3)

wherewy,Xisthenormalisedweight(orthescore)thatisassignedtothere-lationoffeatureXandlabely.ThefeaturesinR(y)arethekmostrelevantfeaturesoftheclasslabely.Fromaconceptualpointofview,thesekfeatures


Algorithm 3.1: Feature Subset Ranking

Function FeatureRanking(D,Y,F)R,F ′ ← ∅foreach y ∈ Y do

// Define 2-class data setDy = {oi : (xi,yi) ∈ D | yi = y}

Dy = {oi : (xi,yi) ∈ D | yi = y}

Dyy ← Dy ∪Dy

// Find the list of top scored features (X,wy,X)R(y) ← MIFS(Dyy,F)F ′ ← F ′ ∪ {X | ∃ (X,wy,X) ∈ R(y)∧ X ∈ F,w ∈ [0, 1]}

return R, F ′

of R(y) are the suitable candidates to be the quality dimensions that distin-guishes the concept Cy from the other concepts. The score wy,X determinesthe importance of the selected feature X to represent the class label y. Fromthe conceptual space point of view, the scores indicating the weights show thesignificance of the chosen quality dimensions for Cy.

Algorithm 3.1 shows the steps for finding the ranked scored list of featuresfor each label y. The output of the algorithm is then a set of filter method resultsfor all the class labels, denoted by R = {R(y1), . . . ,R(ym)}. In this algorithm,F ′ is the set of all features that appeared (at least once) in the ranked features:

F ′ =⋃y∈Y

{X| ∃ (X,wy,X) ∈ R(y)∧ X ∈ F,wy,X ∈ [0, 1]}, (3.4)

where F ′ ⊂ F. The set of features F ′ is the set of potential features to becomequality dimensions. However, in feature grouping, some of these features maybe filtered out from the target set of quality dimensions.

Example 3.4. Continuing of the leaf data set in Example 3.1, suppose thatafter applying the MIFS method, elongation, lobedness, and roundness areselected as the top features for ytt, R(ytt) = {(Xel,wytt,Xel), (Xlo,wytt,Xlo),(Xro,wytt,Xro)}. Also, elongation, roundness, and indentation are selected foryno, R(yno) = {(Xel,wyno,Xel), (Xro,wyno,Xro), (Xin,wyno,Xin)}. Then, F ′ ={Xel,Xlo,Xro,Xin}.

3.1.2 Feature Subset Grouping

In order to specify a set of domains as subsets of integral quality dimensionsout of selected features, the method should partition features into a numberof subsets regarding their relevancy to the defined class labels. Based on the


Algorithm3.1:FeatureSubsetRanking

FunctionFeatureRanking(D,Y,F)R,F′←∅foreachy∈Ydo

//Define2-classdatasetDy={oi:(xi,yi)∈D|yi=y}

Dy={oi:(xi,yi)∈D|yi=y}

Dyy←Dy∪Dy

//Findthelistoftopscoredfeatures(X,wy,X)R(y)←MIFS(Dyy,F)F′←F′∪{X|∃(X,wy,X)∈R(y)∧X∈F,w∈[0,1]}

returnR,F′

ofR(y)arethesuitablecandidatestobethequalitydimensionsthatdistin-guishestheconceptCyfromtheotherconcepts.Thescorewy,XdeterminestheimportanceoftheselectedfeatureXtorepresenttheclasslabely.Fromtheconceptualspacepointofview,thescoresindicatingtheweightsshowthesignificanceofthechosenqualitydimensionsforCy.

Algorithm3.1showsthestepsforfindingtherankedscoredlistoffeaturesforeachlabely.Theoutputofthealgorithmisthenasetoffiltermethodresultsforalltheclasslabels,denotedbyR={R(y1),...,R(ym)}.Inthisalgorithm,F′isthesetofallfeaturesthatappeared(atleastonce)intherankedfeatures:

F′=⋃y∈Y

{X|∃(X,wy,X)∈R(y)∧X∈F,wy,X∈[0,1]},(3.4)

whereF′⊂F.ThesetoffeaturesF′isthesetofpotentialfeaturestobecomequalitydimensions.However,infeaturegrouping,someofthesefeaturesmaybefilteredoutfromthetargetsetofqualitydimensions.

Example3.4.ContinuingoftheleafdatasetinExample3.1,supposethatafterapplyingtheMIFSmethod,elongation,lobedness,androundnessareselectedasthetopfeaturesforytt,R(ytt)={(Xel,wytt,Xel),(Xlo,wytt,Xlo),(Xro,wytt,Xro)}.Also,elongation,roundness,andindentationareselectedforyno,R(yno)={(Xel,wyno,Xel),(Xro,wyno,Xro),(Xin,wyno,Xin)}.Then,F′={Xel,Xlo,Xro,Xin}.

3.1.2FeatureSubsetGrouping

Inordertospecifyasetofdomainsassubsetsofintegralqualitydimensionsoutofselectedfeatures,themethodshouldpartitionfeaturesintoanumberofsubsetsregardingtheirrelevancytothedefinedclasslabels.Basedonthe


Algorithm 3.1: Feature Subset Ranking

Function FeatureRanking(D,Y,F)R,F ′ ← ∅foreach y ∈ Y do

// Define 2-class data setDy = {oi : (xi,yi) ∈ D | yi = y}

Dy = {oi : (xi,yi) ∈ D | yi = y}

Dyy ← Dy ∪Dy

// Find the list of top scored features (X,wy,X)R(y) ← MIFS(Dyy,F)F ′ ← F ′ ∪ {X | ∃ (X,wy,X) ∈ R(y)∧ X ∈ F,w ∈ [0, 1]}

return R, F ′

of R(y) are the suitable candidates to be the quality dimensions that distin-guishes the concept Cy from the other concepts. The score wy,X determinesthe importance of the selected feature X to represent the class label y. Fromthe conceptual space point of view, the scores indicating the weights show thesignificance of the chosen quality dimensions for Cy.

Algorithm 3.1 shows the steps for finding the ranked scored list of featuresfor each label y. The output of the algorithm is then a set of filter method resultsfor all the class labels, denoted by R = {R(y1), . . . ,R(ym)}. In this algorithm,F ′ is the set of all features that appeared (at least once) in the ranked features:

F ′ =⋃y∈Y

{X| ∃ (X,wy,X) ∈ R(y)∧ X ∈ F,wy,X ∈ [0, 1]}, (3.4)

where F ′ ⊂ F. The set of features F ′ is the set of potential features to becomequality dimensions. However, in feature grouping, some of these features maybe filtered out from the target set of quality dimensions.

Example 3.4. Continuing of the leaf data set in Example 3.1, suppose thatafter applying the MIFS method, elongation, lobedness, and roundness areselected as the top features for ytt, R(ytt) = {(Xel,wytt,Xel), (Xlo,wytt,Xlo),(Xro,wytt,Xro)}. Also, elongation, roundness, and indentation are selected foryno, R(yno) = {(Xel,wyno,Xel), (Xro,wyno,Xro), (Xin,wyno,Xin)}. Then, F ′ ={Xel,Xlo,Xro,Xin}.

3.1.2 Feature Subset Grouping

In order to specify a set of domains as subsets of integral quality dimensionsout of selected features, the method should partition features into a numberof subsets regarding their relevancy to the defined class labels. Based on the


Algorithm3.1:FeatureSubsetRanking

FunctionFeatureRanking(D,Y,F)R,F′←∅foreachy∈Ydo

//Define2-classdatasetDy={oi:(xi,yi)∈D|yi=y}

Dy={oi:(xi,yi)∈D|yi=y}

Dyy←Dy∪Dy

//Findthelistoftopscoredfeatures(X,wy,X)R(y)←MIFS(Dyy,F)F′←F′∪{X|∃(X,wy,X)∈R(y)∧X∈F,w∈[0,1]}

returnR,F′

ofR(y)arethesuitablecandidatestobethequalitydimensionsthatdistin-guishestheconceptCyfromtheotherconcepts.Thescorewy,XdeterminestheimportanceoftheselectedfeatureXtorepresenttheclasslabely.Fromtheconceptualspacepointofview,thescoresindicatingtheweightsshowthesignificanceofthechosenqualitydimensionsforCy.

Algorithm3.1showsthestepsforfindingtherankedscoredlistoffeaturesforeachlabely.Theoutputofthealgorithmisthenasetoffiltermethodresultsforalltheclasslabels,denotedbyR={R(y1),...,R(ym)}.Inthisalgorithm,F′isthesetofallfeaturesthatappeared(atleastonce)intherankedfeatures:

F′=⋃y∈Y

{X|∃(X,wy,X)∈R(y)∧X∈F,wy,X∈[0,1]},(3.4)

whereF′⊂F.ThesetoffeaturesF′isthesetofpotentialfeaturestobecomequalitydimensions.However,infeaturegrouping,someofthesefeaturesmaybefilteredoutfromthetargetsetofqualitydimensions.

Example3.4.ContinuingoftheleafdatasetinExample3.1,supposethatafterapplyingtheMIFSmethod,elongation,lobedness,androundnessareselectedasthetopfeaturesforytt,R(ytt)={(Xel,wytt,Xel),(Xlo,wytt,Xlo),(Xro,wytt,Xro)}.Also,elongation,roundness,andindentationareselectedforyno,R(yno)={(Xel,wyno,Xel),(Xro,wyno,Xro),(Xin,wyno,Xin)}.Then,F′={Xel,Xlo,Xro,Xin}.

3.1.2FeatureSubsetGrouping

Inordertospecifyasetofdomainsassubsetsofintegralqualitydimensionsoutofselectedfeatures,themethodshouldpartitionfeaturesintoanumberofsubsetsregardingtheirrelevancytothedefinedclasslabels.Basedonthe


definitions in conceptual space theory, a quality dimension usually appears ina single domain along other relevant dimensions to represent a specific aspectof conceptualised observations [86,242]. It might be possible to have the samedimension in various domains, but this requires a priori knowledge [31]. More-over, repeating dimensions in a multi-domain space increases the redundancy,and consequently decreases the accuracy of learning concepts. Therefore, theselected features are divided into distinct partitions of features as target do-mains, to avoid either creating a single domain of full features or repeatingfeatures in all of the constructed domains.

Here, a heuristic method is proposed to detect distinct subsets of features,where the features in each subset are highly representative of the most relevantclasses. The output set R in Algorithm 3.1 is a set of ranked features for eachlabel. It is obvious that some features might be repeated in the ranked set ofdifferent class labels in R. From the information in the set R, the goal is toextract those subsets of features that are associated with each other based ontheir co-appearance in the ranked features of each class. This section first intro-duces a graph representation of the label-feature relation, and then it explainshow to derive the correlated features using a greedy search on this graph. Morespecifically, this approach proposes to build up a bipartite graph and search forthe bicliques that identify the most associated feature subsets (i.e., domains).

Let G = (VY ∪ VF ′ ,E,w) be a bipartite graph with two sets of verticesVY and VF ′ , a set of edges E, and w : VY × VF ′ → IR as a weight functionfor the edges. The vertex set VY denotes the class labels in Y. The vertex setVF ′ denotes the top-ranked features in F ′. A vertex vy ∈ VY is connected toa vertex vX ∈ VF ′ if X ∈ F ′ has been selected for y ∈ Y in Algorithm 3.1. Inother words, for each pair (X,wy,X) ∈ R(y) a new edge vyvX is added to theedge set E of bipartite graph G between vertices vy and vX, where the weightof the edge vyvX is denoted by w(vyvX) = wy,X. Figure 3.3 is an illustration ofsuch weighted bipartite graph G.

The idea of grouping features is to find the maximal connected subgraphs inG. More precisely, a subset of features which are all connected to the same set ofclasses is a suitable subset of features for feature grouping. A biclique G ⊂ G isa special bipartite graph where every vertex in one part of vertices is connectedto all the vertices in the other part of the vertices. The highlighted edges inFigure 3.3 depicts an example of a biclique in the given bipartite graph. Let Gbe a biclique denoted by G = (VY ∪ VF ′ , E,w), where VY ⊂ VY, VF ′ ⊂ VF ′ .In this biclique, assume |VY| = m, |VF ′ | = n, thus |E| = m × n. The proposedapproach is looking for a biclique with the highest score as Gmax among all thebicliques in G. The score of a biclique G is calculated using a scoring functionScoreG, based on the weights of its edges, as follows:

ScoreG =∑

vy∈VY

( ∏vX∈VF ′

w(vyvX))/ n (3.5)


definitionsinconceptualspacetheory,aqualitydimensionusuallyappearsinasingledomainalongotherrelevantdimensionstorepresentaspecificaspectofconceptualisedobservations[86,242].Itmightbepossibletohavethesamedimensioninvariousdomains,butthisrequiresaprioriknowledge[31].More-over,repeatingdimensionsinamulti-domainspaceincreasestheredundancy,andconsequentlydecreasestheaccuracyoflearningconcepts.Therefore,theselectedfeaturesaredividedintodistinctpartitionsoffeaturesastargetdo-mains,toavoideithercreatingasingledomainoffullfeaturesorrepeatingfeaturesinalloftheconstructeddomains.

Here,aheuristicmethodisproposedtodetectdistinctsubsetsoffeatures,wherethefeaturesineachsubsetarehighlyrepresentativeofthemostrelevantclasses.TheoutputsetRinAlgorithm3.1isasetofrankedfeaturesforeachlabel.ItisobviousthatsomefeaturesmightberepeatedintherankedsetofdifferentclasslabelsinR.FromtheinformationinthesetR,thegoalistoextractthosesubsetsoffeaturesthatareassociatedwitheachotherbasedontheirco-appearanceintherankedfeaturesofeachclass.Thissectionfirstintro-ducesagraphrepresentationofthelabel-featurerelation,andthenitexplainshowtoderivethecorrelatedfeaturesusingagreedysearchonthisgraph.Morespecifically,thisapproachproposestobuildupabipartitegraphandsearchforthebicliquesthatidentifythemostassociatedfeaturesubsets(i.e.,domains).

LetG=(VY∪VF′,E,w)beabipartitegraphwithtwosetsofverticesVYandVF′,asetofedgesE,andw:VY×VF′→IRasaweightfunctionfortheedges.ThevertexsetVYdenotestheclasslabelsinY.ThevertexsetVF′denotesthetop-rankedfeaturesinF′.Avertexvy∈VYisconnectedtoavertexvX∈VF′ifX∈F′hasbeenselectedfory∈YinAlgorithm3.1.Inotherwords,foreachpair(X,wy,X)∈R(y)anewedgevyvXisaddedtotheedgesetEofbipartitegraphGbetweenverticesvyandvX,wheretheweightoftheedgevyvXisdenotedbyw(vyvX)=wy,X.Figure3.3isanillustrationofsuchweightedbipartitegraphG.

TheideaofgroupingfeaturesistofindthemaximalconnectedsubgraphsinG.Moreprecisely,asubsetoffeatureswhichareallconnectedtothesamesetofclassesisasuitablesubsetoffeaturesforfeaturegrouping.AbicliqueG⊂Gisaspecialbipartitegraphwhereeveryvertexinonepartofverticesisconnectedtoalltheverticesintheotherpartofthevertices.ThehighlightededgesinFigure3.3depictsanexampleofabicliqueinthegivenbipartitegraph.LetGbeabicliquedenotedbyG=(VY∪VF′,E,w),whereVY⊂VY,VF′⊂VF′.Inthisbiclique,assume|VY|=m,|VF′|=n,thus|E|=m×n.TheproposedapproachislookingforabicliquewiththehighestscoreasGmaxamongallthebicliquesinG.ThescoreofabicliqueGiscalculatedusingascoringfunctionScoreG,basedontheweightsofitsedges,asfollows:

ScoreG=∑vy∈VY

(∏vX∈VF′

w(vyvX))/n(3.5)


definitions in conceptual space theory, a quality dimension usually appears ina single domain along other relevant dimensions to represent a specific aspectof conceptualised observations [86,242]. It might be possible to have the samedimension in various domains, but this requires a priori knowledge [31]. More-over, repeating dimensions in a multi-domain space increases the redundancy,and consequently decreases the accuracy of learning concepts. Therefore, theselected features are divided into distinct partitions of features as target do-mains, to avoid either creating a single domain of full features or repeatingfeatures in all of the constructed domains.

Here, a heuristic method is proposed to detect distinct subsets of features,where the features in each subset are highly representative of the most relevantclasses. The output set R in Algorithm 3.1 is a set of ranked features for eachlabel. It is obvious that some features might be repeated in the ranked set ofdifferent class labels in R. From the information in the set R, the goal is toextract those subsets of features that are associated with each other based ontheir co-appearance in the ranked features of each class. This section first intro-duces a graph representation of the label-feature relation, and then it explainshow to derive the correlated features using a greedy search on this graph. Morespecifically, this approach proposes to build up a bipartite graph and search forthe bicliques that identify the most associated feature subsets (i.e., domains).

Let G = (VY ∪ VF ′ ,E,w) be a bipartite graph with two sets of verticesVY and VF ′ , a set of edges E, and w : VY × VF ′ → IR as a weight functionfor the edges. The vertex set VY denotes the class labels in Y. The vertex setVF ′ denotes the top-ranked features in F ′. A vertex vy ∈ VY is connected toa vertex vX ∈ VF ′ if X ∈ F ′ has been selected for y ∈ Y in Algorithm 3.1. Inother words, for each pair (X,wy,X) ∈ R(y) a new edge vyvX is added to theedge set E of bipartite graph G between vertices vy and vX, where the weightof the edge vyvX is denoted by w(vyvX) = wy,X. Figure 3.3 is an illustration ofsuch weighted bipartite graph G.

The idea of grouping features is to find the maximal connected subgraphs inG. More precisely, a subset of features which are all connected to the same set ofclasses is a suitable subset of features for feature grouping. A biclique G ⊂ G isa special bipartite graph where every vertex in one part of vertices is connectedto all the vertices in the other part of the vertices. The highlighted edges inFigure 3.3 depicts an example of a biclique in the given bipartite graph. Let Gbe a biclique denoted by G = (VY ∪ VF ′ , E,w), where VY ⊂ VY, VF ′ ⊂ VF ′ .In this biclique, assume |VY| = m, |VF ′ | = n, thus |E| = m × n. The proposedapproach is looking for a biclique with the highest score as Gmax among all thebicliques in G. The score of a biclique G is calculated using a scoring functionScoreG, based on the weights of its edges, as follows:

ScoreG =∑

vy∈VY

( ∏vX∈VF ′

w(vyvX))/ n (3.5)


definitionsinconceptualspacetheory,aqualitydimensionusuallyappearsinasingledomainalongotherrelevantdimensionstorepresentaspecificaspectofconceptualisedobservations[86,242].Itmightbepossibletohavethesamedimensioninvariousdomains,butthisrequiresaprioriknowledge[31].More-over,repeatingdimensionsinamulti-domainspaceincreasestheredundancy,andconsequentlydecreasestheaccuracyoflearningconcepts.Therefore,theselectedfeaturesaredividedintodistinctpartitionsoffeaturesastargetdo-mains,toavoideithercreatingasingledomainoffullfeaturesorrepeatingfeaturesinalloftheconstructeddomains.

Here,aheuristicmethodisproposedtodetectdistinctsubsetsoffeatures,wherethefeaturesineachsubsetarehighlyrepresentativeofthemostrelevantclasses.TheoutputsetRinAlgorithm3.1isasetofrankedfeaturesforeachlabel.ItisobviousthatsomefeaturesmightberepeatedintherankedsetofdifferentclasslabelsinR.FromtheinformationinthesetR,thegoalistoextractthosesubsetsoffeaturesthatareassociatedwitheachotherbasedontheirco-appearanceintherankedfeaturesofeachclass.Thissectionfirstintro-ducesagraphrepresentationofthelabel-featurerelation,andthenitexplainshowtoderivethecorrelatedfeaturesusingagreedysearchonthisgraph.Morespecifically,thisapproachproposestobuildupabipartitegraphandsearchforthebicliquesthatidentifythemostassociatedfeaturesubsets(i.e.,domains).

LetG=(VY∪VF′,E,w)beabipartitegraphwithtwosetsofverticesVYandVF′,asetofedgesE,andw:VY×VF′→IRasaweightfunctionfortheedges.ThevertexsetVYdenotestheclasslabelsinY.ThevertexsetVF′denotesthetop-rankedfeaturesinF′.Avertexvy∈VYisconnectedtoavertexvX∈VF′ifX∈F′hasbeenselectedfory∈YinAlgorithm3.1.Inotherwords,foreachpair(X,wy,X)∈R(y)anewedgevyvXisaddedtotheedgesetEofbipartitegraphGbetweenverticesvyandvX,wheretheweightoftheedgevyvXisdenotedbyw(vyvX)=wy,X.Figure3.3isanillustrationofsuchweightedbipartitegraphG.

TheideaofgroupingfeaturesistofindthemaximalconnectedsubgraphsinG.Moreprecisely,asubsetoffeatureswhichareallconnectedtothesamesetofclassesisasuitablesubsetoffeaturesforfeaturegrouping.AbicliqueG⊂Gisaspecialbipartitegraphwhereeveryvertexinonepartofverticesisconnectedtoalltheverticesintheotherpartofthevertices.ThehighlightededgesinFigure3.3depictsanexampleofabicliqueinthegivenbipartitegraph.LetGbeabicliquedenotedbyG=(VY∪VF′,E,w),whereVY⊂VY,VF′⊂VF′.Inthisbiclique,assume|VY|=m,|VF′|=n,thus|E|=m×n.TheproposedapproachislookingforabicliquewiththehighestscoreasGmaxamongallthebicliquesinG.ThescoreofabicliqueGiscalculatedusingascoringfunctionScoreG,basedontheweightsofitsedges,asfollows:

ScoreG=∑vy∈VY

(∏vX∈VF′

w(vyvX))/n(3.5)


y1 y2 yi ym

X1 X2 X3 Xj Xn

. . . . . .

. . . . . .

wy1,X1wyi,Xj wym,Xn

Figure 3.3: A weighted bipartite graph with two sets of vertices from the labelsY and the selected features F ′. Also, an example of a biclique shown in thehighlighted edges.

This scoring function calculates the average of the weights associated withthe biclique. This scoring function will return higher values for the bicliqueswith the higher number of class labels, and lower number of features4.

In the selected biclique (Gmax), the involved features VF ′ then will be thesubset of features as the quality dimensions of a domain δ. To identify thenext domain, the set of features VF ′ is eliminated from the graph G since thesefeatures are already assigned to a domain. After that, the process of finding thebest biclique repeats on the updated graph G to find the next maximal biclique.Algorithm 3.2 shows the steps of determining the domains with feature subsetgrouping.

As stated in Definition 3.3, a domain δ is a triple 〈 Q(δ),C(δ),ωδ 〉. Theweight function ωδ = C(δ) × Q(δ) → IR is a function presenting the as-signed salient weight between a concept Cy ∈ Y(δ) and a quality dimensionqX ∈ Q(δ). For a chosen biclique Gmax = (VY ∪ VF ′ , E,w), a domain δ canbe constructed as a triple 〈 Q(δ),C(δ),ωδ 〉 = 〈 VF ′ , VY,w 〉. More specifically,for a constructed domain δ = 〈 Q(δ),C(δ),ωδ 〉, the set of quality dimensionsis Q(δ) = {qX | vX ∈ VF ′ }. Also, the set of concepts related to δ is defined asC(δ) = {Cy | vy ∈ VY}. Then ωδ(Cy,qX) = w(vyvX).

It is worth noting that for the next iteration of finding bicliques, the ver-tices with the labels VY of a chosen biclique are not eliminated while updatingthe bipartite graph, because a class label can be involved in other bicliques infurther iterations. From a conceptual space point of view, it is also meaningful,since a concept can be represented in several domains.

Example 3.5. The corresponding bipartite graph to the ranked features fromExample 3.4 is illustrated in Figure 3.4. In this bipartite graph, one biclique ishighlighted, which can potentially be the best biclique. If so, then the featureselongation and roundness will become the quality dimensions of a new domainδ as: Q(δ) = {qel, qro} and C(δ) = {Ctt,Cno}.

4Assume that all the weights of the edges in a biclique G are equal to a constant weight wc.Then Score

G= m

n (wc)n.


y1y2yiym

X1X2X3XjXn

......

......

wy1,X1wyi,Xjwym,Xn

Figure3.3:AweightedbipartitegraphwithtwosetsofverticesfromthelabelsYandtheselectedfeaturesF′.Also,anexampleofabicliqueshowninthehighlightededges.

Thisscoringfunctioncalculatestheaverageoftheweightsassociatedwiththebiclique.Thisscoringfunctionwillreturnhighervaluesforthebicliqueswiththehighernumberofclasslabels,andlowernumberoffeatures4.

Intheselectedbiclique(Gmax),theinvolvedfeaturesVF′thenwillbethesubsetoffeaturesasthequalitydimensionsofadomainδ.Toidentifythenextdomain,thesetoffeaturesVF′iseliminatedfromthegraphGsincethesefeaturesarealreadyassignedtoadomain.Afterthat,theprocessoffindingthebestbicliquerepeatsontheupdatedgraphGtofindthenextmaximalbiclique.Algorithm3.2showsthestepsofdeterminingthedomainswithfeaturesubsetgrouping.

AsstatedinDefinition3.3,adomainδisatriple〈Q(δ),C(δ),ωδ〉.Theweightfunctionωδ=C(δ)×Q(δ)→IRisafunctionpresentingtheas-signedsalientweightbetweenaconceptCy∈Y(δ)andaqualitydimensionqX∈Q(δ).ForachosenbicliqueGmax=(VY∪VF′,E,w),adomainδcanbeconstructedasatriple〈Q(δ),C(δ),ωδ〉=〈VF′,VY,w〉.Morespecifically,foraconstructeddomainδ=〈Q(δ),C(δ),ωδ〉,thesetofqualitydimensionsisQ(δ)={qX|vX∈VF′}.Also,thesetofconceptsrelatedtoδisdefinedasC(δ)={Cy|vy∈VY}.Thenωδ(Cy,qX)=w(vyvX).

Itisworthnotingthatforthenextiterationoffindingbicliques,thever-ticeswiththelabelsVYofachosenbicliquearenoteliminatedwhileupdatingthebipartitegraph,becauseaclasslabelcanbeinvolvedinotherbicliquesinfurtheriterations.Fromaconceptualspacepointofview,itisalsomeaningful,sinceaconceptcanberepresentedinseveraldomains.

Example3.5.ThecorrespondingbipartitegraphtotherankedfeaturesfromExample3.4isillustratedinFigure3.4.Inthisbipartitegraph,onebicliqueishighlighted,whichcanpotentiallybethebestbiclique.Ifso,thenthefeatureselongationandroundnesswillbecomethequalitydimensionsofanewdomainδas:Q(δ)={qel,qro}andC(δ)={Ctt,Cno}.

4AssumethatalltheweightsoftheedgesinabicliqueGareequaltoaconstantweightwc.ThenScoreG=

mn(wc)

n.


y1 y2 yi ym

X1 X2 X3 Xj Xn

. . . . . .

. . . . . .

wy1,X1wyi,Xj wym,Xn

Figure 3.3: A weighted bipartite graph with two sets of vertices from the labelsY and the selected features F ′. Also, an example of a biclique shown in thehighlighted edges.

This scoring function calculates the average of the weights associated withthe biclique. This scoring function will return higher values for the bicliqueswith the higher number of class labels, and lower number of features4.

In the selected biclique (Gmax), the involved features VF ′ then will be thesubset of features as the quality dimensions of a domain δ. To identify thenext domain, the set of features VF ′ is eliminated from the graph G since thesefeatures are already assigned to a domain. After that, the process of finding thebest biclique repeats on the updated graph G to find the next maximal biclique.Algorithm 3.2 shows the steps of determining the domains with feature subsetgrouping.

As stated in Definition 3.3, a domain δ is a triple 〈 Q(δ),C(δ),ωδ 〉. Theweight function ωδ = C(δ) × Q(δ) → IR is a function presenting the as-signed salient weight between a concept Cy ∈ Y(δ) and a quality dimensionqX ∈ Q(δ). For a chosen biclique Gmax = (VY ∪ VF ′ , E,w), a domain δ canbe constructed as a triple 〈 Q(δ),C(δ),ωδ 〉 = 〈 VF ′ , VY,w 〉. More specifically,for a constructed domain δ = 〈 Q(δ),C(δ),ωδ 〉, the set of quality dimensionsis Q(δ) = {qX | vX ∈ VF ′ }. Also, the set of concepts related to δ is defined asC(δ) = {Cy | vy ∈ VY}. Then ωδ(Cy,qX) = w(vyvX).

It is worth noting that for the next iteration of finding bicliques, the ver-tices with the labels VY of a chosen biclique are not eliminated while updatingthe bipartite graph, because a class label can be involved in other bicliques infurther iterations. From a conceptual space point of view, it is also meaningful,since a concept can be represented in several domains.

Example 3.5. The corresponding bipartite graph to the ranked features fromExample 3.4 is illustrated in Figure 3.4. In this bipartite graph, one biclique ishighlighted, which can potentially be the best biclique. If so, then the featureselongation and roundness will become the quality dimensions of a new domainδ as: Q(δ) = {qel, qro} and C(δ) = {Ctt,Cno}.

4Assume that all the weights of the edges in a biclique G are equal to a constant weight wc.Then Score

G= m

n (wc)n.


y1y2yiym

X1X2X3XjXn

......

......

wy1,X1wyi,Xjwym,Xn

Figure3.3:AweightedbipartitegraphwithtwosetsofverticesfromthelabelsYandtheselectedfeaturesF′.Also,anexampleofabicliqueshowninthehighlightededges.

Thisscoringfunctioncalculatestheaverageoftheweightsassociatedwiththebiclique.Thisscoringfunctionwillreturnhighervaluesforthebicliqueswiththehighernumberofclasslabels,andlowernumberoffeatures4.

Intheselectedbiclique(Gmax),theinvolvedfeaturesVF′thenwillbethesubsetoffeaturesasthequalitydimensionsofadomainδ.Toidentifythenextdomain,thesetoffeaturesVF′iseliminatedfromthegraphGsincethesefeaturesarealreadyassignedtoadomain.Afterthat,theprocessoffindingthebestbicliquerepeatsontheupdatedgraphGtofindthenextmaximalbiclique.Algorithm3.2showsthestepsofdeterminingthedomainswithfeaturesubsetgrouping.

AsstatedinDefinition3.3,adomainδisatriple〈Q(δ),C(δ),ωδ〉.Theweightfunctionωδ=C(δ)×Q(δ)→IRisafunctionpresentingtheas-signedsalientweightbetweenaconceptCy∈Y(δ)andaqualitydimensionqX∈Q(δ).ForachosenbicliqueGmax=(VY∪VF′,E,w),adomainδcanbeconstructedasatriple〈Q(δ),C(δ),ωδ〉=〈VF′,VY,w〉.Morespecifically,foraconstructeddomainδ=〈Q(δ),C(δ),ωδ〉,thesetofqualitydimensionsisQ(δ)={qX|vX∈VF′}.Also,thesetofconceptsrelatedtoδisdefinedasC(δ)={Cy|vy∈VY}.Thenωδ(Cy,qX)=w(vyvX).

Itisworthnotingthatforthenextiterationoffindingbicliques,thever-ticeswiththelabelsVYofachosenbicliquearenoteliminatedwhileupdatingthebipartitegraph,becauseaclasslabelcanbeinvolvedinotherbicliquesinfurtheriterations.Fromaconceptualspacepointofview,itisalsomeaningful,sinceaconceptcanberepresentedinseveraldomains.

Example3.5.ThecorrespondingbipartitegraphtotherankedfeaturesfromExample3.4isillustratedinFigure3.4.Inthisbipartitegraph,onebicliqueishighlighted,whichcanpotentiallybethebestbiclique.Ifso,thenthefeatureselongationandroundnesswillbecomethequalitydimensionsofanewdomainδas:Q(δ)={qel,qro}andC(δ)={Ctt,Cno}.

4AssumethatalltheweightsoftheedgesinabicliqueGareequaltoaconstantweightwc.ThenScoreG=

mn(wc)

n.


Algorithm 3.2: Feature Subset Grouping

Function FeatureGrouping(R, F ′, Y)Δ,Q ← ∅// Build bipartite graphVY ← Y

VF ′ ← F ′

foreach (X,wy,X) ∈ R(y) : y ∈ Y doE ← E ∪ vyvXw(vyvX) ← wy,X

G = (VY ∪ VF ′ ,E,w)// Find max cliques as domainsdo

Gmax(VY ∪ VF ′ , E,w) ← MaxBiclique(G)

if Gmax = ∅ thenbreak

δ : 〈 Q(δ),C(δ),ωδ 〉 ← 〈VF ′ , VY,w〉Δ ← Δ ∪ δ

Q ← Q ∪ Q(δ)// Update bipartite graphG ← (

VY ∪ (VF ′\VF ′), (E\E),w)

until (Q = F ′)return Δ, Q

So far, a methodology to extract domains including their distinct qualitydimensions out of a given data set of observations and features is introduced.Now, each class or label of the observations needs to be represented as conceptsin the proposed conceptual space.

3.2 Concept Representation: An Instance-based

Approach

The vital concern to represent a concept in a conceptual space is to decidewhich are the most relevant quality dimensions and consequently most relevantdomains. A concept may be represented in one domain or several domains. Animportant point is that a concept is not necessarily associated with a certainsubset of domains5, but usually, one domain or a few numbers of domains areprominent to represent a concept [86]. In Section 3.1, the selected features aregrouped into a set of domains out of the extracted bicliques. Using the fact

5For example, the concept of ‘apple’ is mainly represented by colour, texture, and shape do-mains. But it can be associated with further biological features and domains.


Algorithm3.2:FeatureSubsetGrouping

FunctionFeatureGrouping(R,F′,Y)Δ,Q←∅//BuildbipartitegraphVY←Y

VF′←F′

foreach(X,wy,X)∈R(y):y∈YdoE←E∪vyvXw(vyvX)←wy,X

G=(VY∪VF′,E,w)//Findmaxcliquesasdomains

do

Gmax(VY∪VF′,E,w)←MaxBiclique(G)

ifGmax=∅thenbreak

δ:〈Q(δ),C(δ),ωδ〉←〈VF′,VY,w〉Δ←Δ∪δ

Q←Q∪Q(δ)//UpdatebipartitegraphG←(VY∪(VF′\VF′),(E\E),w)

until(Q=F′)returnΔ,Q

Sofar,amethodologytoextractdomainsincludingtheirdistinctqualitydimensionsoutofagivendatasetofobservationsandfeaturesisintroduced.Now,eachclassorlabeloftheobservationsneedstoberepresentedasconceptsintheproposedconceptualspace.

3.2ConceptRepresentation:AnInstance-based

Approach

Thevitalconcerntorepresentaconceptinaconceptualspaceistodecidewhicharethemostrelevantqualitydimensionsandconsequentlymostrelevantdomains.Aconceptmayberepresentedinonedomainorseveraldomains.Animportantpointisthataconceptisnotnecessarilyassociatedwithacertainsubsetofdomains5,butusually,onedomainorafewnumbersofdomainsareprominenttorepresentaconcept[86].InSection3.1,theselectedfeaturesaregroupedintoasetofdomainsoutoftheextractedbicliques.Usingthefact

5Forexample,theconceptof‘apple’ismainlyrepresentedbycolour,texture,andshapedo-mains.Butitcanbeassociatedwithfurtherbiologicalfeaturesanddomains.


Algorithm 3.2: Feature Subset Grouping

Function FeatureGrouping(R, F ′, Y)Δ,Q ← ∅// Build bipartite graphVY ← Y

VF ′ ← F ′

foreach (X,wy,X) ∈ R(y) : y ∈ Y doE ← E ∪ vyvXw(vyvX) ← wy,X

G = (VY ∪ VF ′ ,E,w)// Find max cliques as domainsdo

Gmax(VY ∪ VF ′ , E,w) ← MaxBiclique(G)

if Gmax = ∅ thenbreak

δ : 〈 Q(δ),C(δ),ωδ 〉 ← 〈VF ′ , VY,w〉Δ ← Δ ∪ δ

Q ← Q ∪ Q(δ)// Update bipartite graphG ← (

VY ∪ (VF ′\VF ′), (E\E),w)

until (Q = F ′)return Δ, Q

So far, a methodology to extract domains including their distinct qualitydimensions out of a given data set of observations and features is introduced.Now, each class or label of the observations needs to be represented as conceptsin the proposed conceptual space.

3.2 Concept Representation: An Instance-based

Approach

The vital concern to represent a concept in a conceptual space is to decidewhich are the most relevant quality dimensions and consequently most relevantdomains. A concept may be represented in one domain or several domains. Animportant point is that a concept is not necessarily associated with a certainsubset of domains5, but usually, one domain or a few numbers of domains areprominent to represent a concept [86]. In Section 3.1, the selected features aregrouped into a set of domains out of the extracted bicliques. Using the fact

5For example, the concept of ‘apple’ is mainly represented by colour, texture, and shape do-mains. But it can be associated with further biological features and domains.


Algorithm3.2:FeatureSubsetGrouping

FunctionFeatureGrouping(R,F′,Y)Δ,Q←∅//BuildbipartitegraphVY←Y

VF′←F′

foreach(X,wy,X)∈R(y):y∈YdoE←E∪vyvXw(vyvX)←wy,X

G=(VY∪VF′,E,w)//Findmaxcliquesasdomains

do

Gmax(VY∪VF′,E,w)←MaxBiclique(G)

ifGmax=∅thenbreak

δ:〈Q(δ),C(δ),ωδ〉←〈VF′,VY,w〉Δ←Δ∪δ

Q←Q∪Q(δ)//UpdatebipartitegraphG←(VY∪(VF′\VF′),(E\E),w)

until(Q=F′)returnΔ,Q

Sofar,amethodologytoextractdomainsincludingtheirdistinctqualitydimensionsoutofagivendatasetofobservationsandfeaturesisintroduced.Now,eachclassorlabeloftheobservationsneedstoberepresentedasconceptsintheproposedconceptualspace.

3.2ConceptRepresentation:AnInstance-based

Approach

Thevitalconcerntorepresentaconceptinaconceptualspaceistodecidewhicharethemostrelevantqualitydimensionsandconsequentlymostrelevantdomains.Aconceptmayberepresentedinonedomainorseveraldomains.Animportantpointisthataconceptisnotnecessarilyassociatedwithacertainsubsetofdomains5,butusually,onedomainorafewnumbersofdomainsareprominenttorepresentaconcept[86].InSection3.1,theselectedfeaturesaregroupedintoasetofdomainsoutoftheextractedbicliques.Usingthefact

5Forexample,theconceptof‘apple’ismainlyrepresentedbycolour,texture,andshapedo-mains.Butitcanbeassociatedwithfurtherbiologicalfeaturesanddomains.

3.2. CONCEPT REPRESENTATION 45

ytt yno

Xin Xel Xlo Xro

Figure 3.4: A bigraph graph and one selected biclique (blue edges) for the leafexample (explained in Example 3.5).

that each class label will be involved in at least one selected biclique, then thecorresponding concept is assigned to (or associated with) a certain number ofdomains (at least one). With the output of Algorithm 3.2 it is already knownwhich concepts are associated with which domains.

Example 3.6. For the selected biclique G in Figure 3.4, the concepts Ctt andCno are associated with a generated domain δi including the quality dimensionsqel and qro.

It is possible that one concept appears in two bicliques, which means theconcept is relevant to both specified domains. This fact is consistent with theconceptual spaces theory since a concept is not always represented within a sin-gle domain. The common example is the concept of ‘apple’ which is representedwith more than a single domain, such as colour, taste, size, etc. A concept withmerely one related domain is called property [86]. In fact, a property is a spe-cial form of a concept defined in a single domain [91]. For example, the colour‘green’ is a property which is represented only in the colour domain. Thus, aconcept can be specified as a single property within a single domain (e.g., green),or as a collection of properties within several domains (e.g., apple). Dependingon the domain specification process, a class label in the input data set mightbe represented either as a property in one domain or as a concept in severaldomains. The problem of deriving the domains in a data-driven manner is thatfor a concept represented in several domains, there is no trivial interpretationfor the meaning of its properties within the domains. This issue comes from thefact that the interpretation of the data-driven domains themselves is also tricky.

For a set of concepts C = {Cy1 , . . . ,Cyn}, the problem is how to formulate

the geometrical representation of concepts in the conceptual space with theextracted set of domains Δ. In general, a natural concept is a collection ofregions across one or more domains along with a set of salient weights to thedomains [86]. For a concept Cy, let Δ(y) ∈ Δ be a subset of domains thatcontain concept Cy in their concept sets, as Δ(y) = {δi| δi ∈ Δ ∧ Cy ∈ C(δi)},assuming that |Δ(y)| = k. The concept Cy is presented by a collection of sub-

3.2.CONCEPTREPRESENTATION45

yttyno

XinXelXloXro

Figure3.4:Abigraphgraphandoneselectedbiclique(blueedges)fortheleafexample(explainedinExample3.5).

thateachclasslabelwillbeinvolvedinatleastoneselectedbiclique,thenthecorrespondingconceptisassignedto(orassociatedwith)acertainnumberofdomains(atleastone).WiththeoutputofAlgorithm3.2itisalreadyknownwhichconceptsareassociatedwithwhichdomains.

Example3.6.FortheselectedbicliqueGinFigure3.4,theconceptsCttandCnoareassociatedwithagenerateddomainδiincludingthequalitydimensionsqelandqro.

Itispossiblethatoneconceptappearsintwobicliques,whichmeanstheconceptisrelevanttobothspecifieddomains.Thisfactisconsistentwiththeconceptualspacestheorysinceaconceptisnotalwaysrepresentedwithinasin-gledomain.Thecommonexampleistheconceptof‘apple’whichisrepresentedwithmorethanasingledomain,suchascolour,taste,size,etc.Aconceptwithmerelyonerelateddomainiscalledproperty[86].Infact,apropertyisaspe-cialformofaconceptdefinedinasingledomain[91].Forexample,thecolour‘green’isapropertywhichisrepresentedonlyinthecolourdomain.Thus,aconceptcanbespecifiedasasinglepropertywithinasingledomain(e.g.,green),orasacollectionofpropertieswithinseveraldomains(e.g.,apple).Dependingonthedomainspecificationprocess,aclasslabelintheinputdatasetmightberepresentedeitherasapropertyinonedomainorasaconceptinseveraldomains.Theproblemofderivingthedomainsinadata-drivenmanneristhatforaconceptrepresentedinseveraldomains,thereisnotrivialinterpretationforthemeaningofitspropertieswithinthedomains.Thisissuecomesfromthefactthattheinterpretationofthedata-drivendomainsthemselvesisalsotricky.

ForasetofconceptsC={Cy1,...,Cyn},theproblemishowtoformulatethegeometricalrepresentationofconceptsintheconceptualspacewiththeextractedsetofdomainsΔ.Ingeneral,anaturalconceptisacollectionofregionsacrossoneormoredomainsalongwithasetofsalientweightstothedomains[86].ForaconceptCy,letΔ(y)∈ΔbeasubsetofdomainsthatcontainconceptCyintheirconceptsets,asΔ(y)={δi|δi∈Δ∧Cy∈C(δi)},assumingthat|Δ(y)|=k.TheconceptCyispresentedbyacollectionofsub-


ytt yno

Xin Xel Xlo Xro

Figure 3.4: A bigraph graph and one selected biclique (blue edges) for the leafexample (explained in Example 3.5).

that each class label will be involved in at least one selected biclique, then thecorresponding concept is assigned to (or associated with) a certain number ofdomains (at least one). With the output of Algorithm 3.2 it is already knownwhich concepts are associated with which domains.

Example 3.6. For the selected biclique G in Figure 3.4, the concepts Ctt andCno are associated with a generated domain δi including the quality dimensionsqel and qro.

It is possible that one concept appears in two bicliques, which means theconcept is relevant to both specified domains. This fact is consistent with theconceptual spaces theory since a concept is not always represented within a sin-gle domain. The common example is the concept of ‘apple’ which is representedwith more than a single domain, such as colour, taste, size, etc. A concept withmerely one related domain is called property [86]. In fact, a property is a spe-cial form of a concept defined in a single domain [91]. For example, the colour‘green’ is a property which is represented only in the colour domain. Thus, aconcept can be specified as a single property within a single domain (e.g., green),or as a collection of properties within several domains (e.g., apple). Dependingon the domain specification process, a class label in the input data set mightbe represented either as a property in one domain or as a concept in severaldomains. The problem of deriving the domains in a data-driven manner is thatfor a concept represented in several domains, there is no trivial interpretationfor the meaning of its properties within the domains. This issue comes from thefact that the interpretation of the data-driven domains themselves is also tricky.

For a set of concepts C = {Cy1 , . . . ,Cyn}, the problem is how to formulate

the geometrical representation of concepts in the conceptual space with theextracted set of domains Δ. In general, a natural concept is a collection ofregions across one or more domains along with a set of salient weights to thedomains [86]. For a concept Cy, let Δ(y) ∈ Δ be a subset of domains thatcontain concept Cy in their concept sets, as Δ(y) = {δi| δi ∈ Δ ∧ Cy ∈ C(δi)},assuming that |Δ(y)| = k. The concept Cy is presented by a collection of sub-


yttyno

XinXelXloXro

Figure3.4:Abigraphgraphandoneselectedbiclique(blueedges)fortheleafexample(explainedinExample3.5).

thateachclasslabelwillbeinvolvedinatleastoneselectedbiclique,thenthecorrespondingconceptisassignedto(orassociatedwith)acertainnumberofdomains(atleastone).WiththeoutputofAlgorithm3.2itisalreadyknownwhichconceptsareassociatedwithwhichdomains.

Example3.6.FortheselectedbicliqueGinFigure3.4,theconceptsCttandCnoareassociatedwithagenerateddomainδiincludingthequalitydimensionsqelandqro.

Itispossiblethatoneconceptappearsintwobicliques,whichmeanstheconceptisrelevanttobothspecifieddomains.Thisfactisconsistentwiththeconceptualspacestheorysinceaconceptisnotalwaysrepresentedwithinasin-gledomain.Thecommonexampleistheconceptof‘apple’whichisrepresentedwithmorethanasingledomain,suchascolour,taste,size,etc.Aconceptwithmerelyonerelateddomainiscalledproperty[86].Infact,apropertyisaspe-cialformofaconceptdefinedinasingledomain[91].Forexample,thecolour‘green’isapropertywhichisrepresentedonlyinthecolourdomain.Thus,aconceptcanbespecifiedasasinglepropertywithinasingledomain(e.g.,green),orasacollectionofpropertieswithinseveraldomains(e.g.,apple).Dependingonthedomainspecificationprocess,aclasslabelintheinputdatasetmightberepresentedeitherasapropertyinonedomainorasaconceptinseveraldomains.Theproblemofderivingthedomainsinadata-drivenmanneristhatforaconceptrepresentedinseveraldomains,thereisnotrivialinterpretationforthemeaningofitspropertieswithinthedomains.Thisissuecomesfromthefactthattheinterpretationofthedata-drivendomainsthemselvesisalsotricky.

ForasetofconceptsC={Cy1,...,Cyn},theproblemishowtoformulatethegeometricalrepresentationofconceptsintheconceptualspacewiththeextractedsetofdomainsΔ.Ingeneral,anaturalconceptisacollectionofregionsacrossoneormoredomainsalongwithasetofsalientweightstothedomains[86].ForaconceptCy,letΔ(y)∈ΔbeasubsetofdomainsthatcontainconceptCyintheirconceptsets,asΔ(y)={δi|δi∈Δ∧Cy∈C(δi)},assumingthat|Δ(y)|=k.TheconceptCyispresentedbyacollectionofsub-


Algorithm 3.3: Concept Representation

Input: Dy ⊂ D: set of observations that are labelled by y ∈ Y andΔ(y) ⊆ Δ: domains that contain Cy in their concept sets.

Output: A Concept Cy = {c1y, . . . , cky}, representing label y in the

conceptual space.foreach o ∈ Dy do

γo ← Vectorise(o,Δ(y),Q) // determine p1γ, . . . ,pk

γ in δ1, ..., δkΓ(y) ← Γ(y) ∪ γo

foreach δi ∈ Δ(y) // δi = 〈Q(δi),C(δi),ωδi〉 , 1 � i � k

dociy ← ∅foreach γ ∈ Γ(y) do

Piy ← Pi

y ∪ {piγ}

η ← ConvexHull(Piy)

φ ← {ωδi(Cy,qi)|Cy ∈ C(δi),qi ∈ Q(δi)}

ciy ← 〈η,φ〉Cy ← Cy ∪ ciy

concepts, denoted: Cy = {c1y, . . . , cky}, where each ciy is the representation of

Cy within the domain δi ∈ Δ(y)6.

Definition 3.4. A sub-concept ciy, representing the concept Cy in the domainδi, is defined as a tuple 〈η,φ〉, where η is the region representing the geometricalarea of Cy in the domain δi, and φ is a set of weights indicating the assigneddegrees of salience between Cy and each quality dimension q ∈ Q(δi).

In order to represent a concept, the representation of its sub-concepts is de-fined. The following two sections describe the way to formally represent a con-cept, by defining its regions and its set of weights, respectively. Algorithm 3.3shows the steps of the concept representation, with the required parameters torepresent a concept Cy.

3.2.1 Convex Regions of Concepts

The identification of the geometrical regions of concepts is based on the loca-tion of the known observations. The concept Cy ∈ C is represented using thesubset of observations Dy = {o1,o2, . . . ,ony

} which are labelled with y ∈ Y.The set of instances Γ(y) is defined related to the observations in Dy, denoted

6 If a concept is associated with only one domain, then the representation of the concept is infact equivalent to the representation of its sub-concept in that domain.


Algorithm3.3:ConceptRepresentation

Input:Dy⊂D:setofobservationsthatarelabelledbyy∈YandΔ(y)⊆Δ:domainsthatcontainCyintheirconceptsets.

Output:AConceptCy={c1y,...,c

ky},representinglabelyinthe

conceptualspace.foreacho∈Dydo

γo←Vectorise(o,Δ(y),Q)//determinep1γ,...,p

kγinδ1,...,δk

Γ(y)←Γ(y)∪γo

foreachδi∈Δ(y)//δi=〈Q(δi),C(δi),ωδi〉,1�i�k

dociy←∅

foreachγ∈Γ(y)doPiy←P

iy∪{p

iγ}

η←ConvexHull(Piy)

φ←{ωδi(Cy,qi)|Cy∈C(δi),q

i∈Q(δi)}

ciy←〈η,φ〉Cy←Cy∪c

iy

concepts,denoted:Cy={c1y,...,c

ky},whereeachc

iyistherepresentationof

Cywithinthedomainδi∈Δ(y)6.

Definition3.4.Asub-conceptciy,representingtheconceptCyinthedomain

δi,isdefinedasatuple〈η,φ〉,whereηistheregionrepresentingthegeometricalareaofCyinthedomainδi,andφisasetofweightsindicatingtheassigneddegreesofsaliencebetweenCyandeachqualitydimensionq∈Q(δi).

Inordertorepresentaconcept,therepresentationofitssub-conceptsisde-fined.Thefollowingtwosectionsdescribethewaytoformallyrepresentacon-cept,bydefiningitsregionsanditssetofweights,respectively.Algorithm3.3showsthestepsoftheconceptrepresentation,withtherequiredparameterstorepresentaconceptCy.

3.2.1ConvexRegionsofConcepts

Theidentificationofthegeometricalregionsofconceptsisbasedontheloca-tionoftheknownobservations.TheconceptCy∈CisrepresentedusingthesubsetofobservationsDy={o1,o2,...,ony}whicharelabelledwithy∈Y.ThesetofinstancesΓ(y)isdefinedrelatedtotheobservationsinDy,denoted

6Ifaconceptisassociatedwithonlyonedomain,thentherepresentationoftheconceptisinfactequivalenttotherepresentationofitssub-conceptinthatdomain.


Algorithm 3.3: Concept Representation

Input: Dy ⊂ D: set of observations that are labelled by y ∈ Y andΔ(y) ⊆ Δ: domains that contain Cy in their concept sets.

Output: A Concept Cy = {c1y, . . . , cky}, representing label y in the

conceptual space.foreach o ∈ Dy do

γo ← Vectorise(o,Δ(y),Q) // determine p1γ, . . . ,pk

γ in δ1, ..., δkΓ(y) ← Γ(y) ∪ γo

foreach δi ∈ Δ(y) // δi = 〈Q(δi),C(δi),ωδi〉 , 1 � i � k

dociy ← ∅foreach γ ∈ Γ(y) do

Piy ← Pi

y ∪ {piγ}

η ← ConvexHull(Piy)

φ ← {ωδi(Cy,qi)|Cy ∈ C(δi),qi ∈ Q(δi)}

ciy ← 〈η,φ〉Cy ← Cy ∪ ciy

concepts, denoted: Cy = {c1y, . . . , cky}, where each ciy is the representation of

Cy within the domain δi ∈ Δ(y)6.

Definition 3.4. A sub-concept ciy, representing the concept Cy in the domainδi, is defined as a tuple 〈η,φ〉, where η is the region representing the geometricalarea of Cy in the domain δi, and φ is a set of weights indicating the assigneddegrees of salience between Cy and each quality dimension q ∈ Q(δi).

In order to represent a concept, the representation of its sub-concepts is de-fined. The following two sections describe the way to formally represent a con-cept, by defining its regions and its set of weights, respectively. Algorithm 3.3shows the steps of the concept representation, with the required parameters torepresent a concept Cy.

3.2.1 Convex Regions of Concepts

The identification of the geometrical regions of concepts is based on the loca-tion of the known observations. The concept Cy ∈ C is represented using thesubset of observations Dy = {o1,o2, . . . ,ony

} which are labelled with y ∈ Y.The set of instances Γ(y) is defined related to the observations in Dy, denoted

6 If a concept is associated with only one domain, then the representation of the concept is infact equivalent to the representation of its sub-concept in that domain.


Algorithm3.3:ConceptRepresentation

Input:Dy⊂D:setofobservationsthatarelabelledbyy∈YandΔ(y)⊆Δ:domainsthatcontainCyintheirconceptsets.

Output:AConceptCy={c1y,...,c

ky},representinglabelyinthe

conceptualspace.foreacho∈Dydo

γo←Vectorise(o,Δ(y),Q)//determinep1γ,...,p

kγinδ1,...,δk

Γ(y)←Γ(y)∪γo

foreachδi∈Δ(y)//δi=〈Q(δi),C(δi),ωδi〉,1�i�k

dociy←∅

foreachγ∈Γ(y)doPiy←P

iy∪{p

iγ}

η←ConvexHull(Piy)

φ←{ωδi(Cy,qi)|Cy∈C(δi),q

i∈Q(δi)}

ciy←〈η,φ〉Cy←Cy∪c

iy

concepts,denoted:Cy={c1y,...,c

ky},whereeachc

iyistherepresentationof

Cywithinthedomainδi∈Δ(y)6.

Definition3.4.Asub-conceptciy,representingtheconceptCyinthedomain

δi,isdefinedasatuple〈η,φ〉,whereηistheregionrepresentingthegeometricalareaofCyinthedomainδi,andφisasetofweightsindicatingtheassigneddegreesofsaliencebetweenCyandeachqualitydimensionq∈Q(δi).

Inordertorepresentaconcept,therepresentationofitssub-conceptsisde-fined.Thefollowingtwosectionsdescribethewaytoformallyrepresentacon-cept,bydefiningitsregionsanditssetofweights,respectively.Algorithm3.3showsthestepsoftheconceptrepresentation,withtherequiredparameterstorepresentaconceptCy.

3.2.1ConvexRegionsofConcepts

Theidentificationofthegeometricalregionsofconceptsisbasedontheloca-tionoftheknownobservations.TheconceptCy∈CisrepresentedusingthesubsetofobservationsDy={o1,o2,...,ony}whicharelabelledwithy∈Y.ThesetofinstancesΓ(y)isdefinedrelatedtotheobservationsinDy,denoted

6Ifaconceptisassociatedwithonlyonedomain,thentherepresentationoftheconceptisinfactequivalenttotherepresentationofitssub-conceptinthatdomain.


by Γ(y) = {γ1,γ2, . . . ,γny}. These instances then specify the geometrical repre-

sentation of the concept Cy as a set of regions within the domains. The set ofall instances Γ in a conceptual space S is defined as: Γ =

⋃y∈Y Γ(y).

Definition 3.5. An instance γ related to the concept Cy is a finite set of n-dimensional points γ = {p1

γ, . . . ,pkγ} with a one-to-one mapping from the in-

stance points to the domains Δ(y), where |Δ(y)| = |Cy| = k.

An instance γo ∈ Γ(y) is the representation of the observation o ∈ Dy. Thepoints of γo are basically the values of the associated quality dimensions, whichare stored in the feature vector xo. Formally, each point pi

γ ∈ γo in a domainδi ∈ Δ(y) is a numeric vector of the values of the quality dimensions in δi,denoted: pi

γ =< q1(γo), . . . ,q|Q(δi)|(γo) >, which is a sub-vector of the featurevector xo that includes the features associated with the quality dimensions inQ(δi). This process of determining the points of an instance γo using the featurevector of the observation o is called vectorisation.

Since all the instances with the label y have a point pi in domain δi ∈ Δ(y),to identify the convex region η of a sub-concept ciy, it is necessary to knowthe location of all these points in the domain. Let Pi

y be the collection of allthe points which their corresponding instances are labelled by y, and thesepoints are located in domain δi. So, Pi

y = {piγ1

,piγ2

, . . . ,piγny

}, where piγj

∈ γj,γj ∈ Γ(y), and j = 1...ny.

Example 3.7. Figure 3.5a consists of two domains δa and δb with two andthree quality dimensions, respectively. Assume a concept Cy has two sub-concepts cay and cby within these domains. So, each instance γj ∈ Γ(y) includestwo points pa

j and pbj located in the domains. Figure 3.5a depicts the set of

points Pay and Pb

y, which will be used to represent sub-concepts cay and cby,respectively.

The convexity, connectedness, and betweenness are the geometrical criteriarequired to define a region for a concept in the theory of conceptual spaces [86].The convexity of concepts is crucial to facilitate the learnability of conceptsthrough the instances [90]. Also, with the convexity, this approach satisfies thenotion of betweenness notion as an essential relation of observations in a sameregion. So, if two observations of a concept are located at points p1 and p2,then any instance located between p1 and p2 also belongs to the same con-cept [86]. A convex region is a geometric structure within a multidimensionaldomain which satisfies convexity and connectedness criteria. There are variousapproaches to identify the convex region covering a set of giving points, suchas convex hull and Voronoi tessellations algorithms, or defining an ellipsoidaround the points [9, 86]. For the purpose of this work, the convex hull (CH)is a more convenient choice among others, because it also satisfies the between-ness criterion. Since the concepts’ regions are formed by its instances, both


byΓ(y)={γ1,γ2,...,γny}.Theseinstancesthenspecifythegeometricalrepre-sentationoftheconceptCyasasetofregionswithinthedomains.ThesetofallinstancesΓinaconceptualspaceSisdefinedas:Γ=⋃y∈YΓ(y).

Definition3.5.AninstanceγrelatedtotheconceptCyisafinitesetofn-dimensionalpointsγ={p1

γ,...,pkγ}withaone-to-onemappingfromthein-

stancepointstothedomainsΔ(y),where|Δ(y)|=|Cy|=k.

Aninstanceγo∈Γ(y)istherepresentationoftheobservationo∈Dy.Thepointsofγoarebasicallythevaluesoftheassociatedqualitydimensions,whicharestoredinthefeaturevectorxo.Formally,eachpointp

iγ∈γoinadomain

δi∈Δ(y)isanumericvectorofthevaluesofthequalitydimensionsinδi,denoted:p

iγ=<q1(γo),...,q|Q(δi)|(γo)>,whichisasub-vectorofthefeature

vectorxothatincludesthefeaturesassociatedwiththequalitydimensionsinQ(δi).Thisprocessofdeterminingthepointsofaninstanceγousingthefeaturevectoroftheobservationoiscalledvectorisation.

Sincealltheinstanceswiththelabelyhaveapointpi

indomainδi∈Δ(y),toidentifytheconvexregionηofasub-conceptc

iy,itisnecessarytoknow

thelocationofallthesepointsinthedomain.LetPiybethecollectionofall

thepointswhichtheircorrespondinginstancesarelabelledbyy,andthesepointsarelocatedindomainδi.So,P

iy={p

iγ1,p

iγ2,...,p

iγny},wherep

iγj∈γj,

γj∈Γ(y),andj=1...ny.

Example3.7.Figure3.5aconsistsoftwodomainsδaandδbwithtwoandthreequalitydimensions,respectively.AssumeaconceptCyhastwosub-conceptsc

ayandc

bywithinthesedomains.So,eachinstanceγj∈Γ(y)includes

twopointspajandp

bjlocatedinthedomains.Figure3.5adepictsthesetof

pointsPayandP

by,whichwillbeusedtorepresentsub-conceptsc

ayandc

by,

respectively.

Theconvexity,connectedness,andbetweennessarethegeometricalcriteriarequiredtodefinearegionforaconceptinthetheoryofconceptualspaces[86].Theconvexityofconceptsiscrucialtofacilitatethelearnabilityofconceptsthroughtheinstances[90].Also,withtheconvexity,thisapproachsatisfiesthenotionofbetweennessnotionasanessentialrelationofobservationsinasameregion.So,iftwoobservationsofaconceptarelocatedatpointsp1andp2,thenanyinstancelocatedbetweenp1andp2alsobelongstothesamecon-cept[86].Aconvexregionisageometricstructurewithinamultidimensionaldomainwhichsatisfiesconvexityandconnectednesscriteria.Therearevariousapproachestoidentifytheconvexregioncoveringasetofgivingpoints,suchasconvexhullandVoronoitessellationsalgorithms,ordefininganellipsoidaroundthepoints[9,86].Forthepurposeofthiswork,theconvexhull(CH)isamoreconvenientchoiceamongothers,becauseitalsosatisfiesthebetween-nesscriterion.Sincetheconcepts’regionsareformedbyitsinstances,both


by Γ(y) = {γ1,γ2, . . . ,γny}. These instances then specify the geometrical repre-

sentation of the concept Cy as a set of regions within the domains. The set ofall instances Γ in a conceptual space S is defined as: Γ =

⋃y∈Y Γ(y).

Definition 3.5. An instance γ related to the concept Cy is a finite set of n-dimensional points γ = {p1

γ, . . . ,pkγ} with a one-to-one mapping from the in-

stance points to the domains Δ(y), where |Δ(y)| = |Cy| = k.

An instance γo ∈ Γ(y) is the representation of the observation o ∈ Dy. Thepoints of γo are basically the values of the associated quality dimensions, whichare stored in the feature vector xo. Formally, each point pi

γ ∈ γo in a domainδi ∈ Δ(y) is a numeric vector of the values of the quality dimensions in δi,denoted: pi

γ =< q1(γo), . . . ,q|Q(δi)|(γo) >, which is a sub-vector of the featurevector xo that includes the features associated with the quality dimensions inQ(δi). This process of determining the points of an instance γo using the featurevector of the observation o is called vectorisation.

Since all the instances with the label y have a point pi in domain δi ∈ Δ(y),to identify the convex region η of a sub-concept ciy, it is necessary to knowthe location of all these points in the domain. Let Pi

y be the collection of allthe points which their corresponding instances are labelled by y, and thesepoints are located in domain δi. So, Pi

y = {piγ1

,piγ2

, . . . ,piγny

}, where piγj

∈ γj,γj ∈ Γ(y), and j = 1...ny.

Example 3.7. Figure 3.5a consists of two domains δa and δb with two andthree quality dimensions, respectively. Assume a concept Cy has two sub-concepts cay and cby within these domains. So, each instance γj ∈ Γ(y) includestwo points pa

j and pbj located in the domains. Figure 3.5a depicts the set of

points Pay and Pb

y, which will be used to represent sub-concepts cay and cby,respectively.

The convexity, connectedness, and betweenness are the geometrical criteriarequired to define a region for a concept in the theory of conceptual spaces [86].The convexity of concepts is crucial to facilitate the learnability of conceptsthrough the instances [90]. Also, with the convexity, this approach satisfies thenotion of betweenness notion as an essential relation of observations in a sameregion. So, if two observations of a concept are located at points p1 and p2,then any instance located between p1 and p2 also belongs to the same con-cept [86]. A convex region is a geometric structure within a multidimensionaldomain which satisfies convexity and connectedness criteria. There are variousapproaches to identify the convex region covering a set of giving points, suchas convex hull and Voronoi tessellations algorithms, or defining an ellipsoidaround the points [9, 86]. For the purpose of this work, the convex hull (CH)is a more convenient choice among others, because it also satisfies the between-ness criterion. Since the concepts’ regions are formed by its instances, both


byΓ(y)={γ1,γ2,...,γny}.Theseinstancesthenspecifythegeometricalrepre-sentationoftheconceptCyasasetofregionswithinthedomains.ThesetofallinstancesΓinaconceptualspaceSisdefinedas:Γ=⋃y∈YΓ(y).

Definition3.5.AninstanceγrelatedtotheconceptCyisafinitesetofn-dimensionalpointsγ={p1

γ,...,pkγ}withaone-to-onemappingfromthein-

stancepointstothedomainsΔ(y),where|Δ(y)|=|Cy|=k.

Aninstanceγo∈Γ(y)istherepresentationoftheobservationo∈Dy.Thepointsofγoarebasicallythevaluesoftheassociatedqualitydimensions,whicharestoredinthefeaturevectorxo.Formally,eachpointp

iγ∈γoinadomain

δi∈Δ(y)isanumericvectorofthevaluesofthequalitydimensionsinδi,denoted:p

iγ=<q1(γo),...,q|Q(δi)|(γo)>,whichisasub-vectorofthefeature

vectorxothatincludesthefeaturesassociatedwiththequalitydimensionsinQ(δi).Thisprocessofdeterminingthepointsofaninstanceγousingthefeaturevectoroftheobservationoiscalledvectorisation.

Sincealltheinstanceswiththelabelyhaveapointpi

indomainδi∈Δ(y),toidentifytheconvexregionηofasub-conceptc

iy,itisnecessarytoknow

thelocationofallthesepointsinthedomain.LetPiybethecollectionofall

thepointswhichtheircorrespondinginstancesarelabelledbyy,andthesepointsarelocatedindomainδi.So,P

iy={p

iγ1,p

iγ2,...,p

iγny},wherep

iγj∈γj,

γj∈Γ(y),andj=1...ny.

Example3.7.Figure3.5aconsistsoftwodomainsδaandδbwithtwoandthreequalitydimensions,respectively.AssumeaconceptCyhastwosub-conceptsc

ayandc

bywithinthesedomains.So,eachinstanceγj∈Γ(y)includes

twopointspajandp

bjlocatedinthedomains.Figure3.5adepictsthesetof

pointsPayandP

by,whichwillbeusedtorepresentsub-conceptsc

ayandc

by,

respectively.

Theconvexity,connectedness,andbetweennessarethegeometricalcriteriarequiredtodefinearegionforaconceptinthetheoryofconceptualspaces[86].Theconvexityofconceptsiscrucialtofacilitatethelearnabilityofconceptsthroughtheinstances[90].Also,withtheconvexity,thisapproachsatisfiesthenotionofbetweennessnotionasanessentialrelationofobservationsinasameregion.So,iftwoobservationsofaconceptarelocatedatpointsp1andp2,thenanyinstancelocatedbetweenp1andp2alsobelongstothesamecon-cept[86].Aconvexregionisageometricstructurewithinamultidimensionaldomainwhichsatisfiesconvexityandconnectednesscriteria.Therearevariousapproachestoidentifytheconvexregioncoveringasetofgivingpoints,suchasconvexhullandVoronoitessellationsalgorithms,ordefininganellipsoidaroundthepoints[9,86].Forthepurposeofthiswork,theconvexhull(CH)isamoreconvenientchoiceamongothers,becauseitalsosatisfiesthebetween-nesscriterion.Sincetheconcepts’regionsareformedbyitsinstances,both


ellipsoid and Voronoi regions assign some points of the space to a concept’sregion, which are not necessarily between the concept’s known instances.

For a sub-concept ciy, the convex region η is defined as the convex hull (i.e.,convex polytope) of the points in Pi

y, as: η(ciy) = CH(Piy).

Example 3.8. Figure 3.5b shows the convex regions of the sub-concepts cay andcby. The convex hull η(cay) is a 2D polygon in δa, and η(cby) is a 3D polytope inδb. These convex hulls are calculated based on the points Pa

y and Pby from the

instances in Γ(y) in Figure 3.5a.

(a) Instances in Γ(y) and their corresponding set of points, Pay and Pb

y.

(b) Convex regions of the sub-concepts cay and cby.

Figure 3.5: A concept representation example in a conceptual space with do-mains δa and δb.


ellipsoidandVoronoiregionsassignsomepointsofthespacetoaconcept’sregion,whicharenotnecessarilybetweentheconcept’sknowninstances.

Forasub-conceptciy,theconvexregionηisdefinedastheconvexhull(i.e.,

convexpolytope)ofthepointsinPiy,as:η(c

iy)=CH(P

iy).

Example3.8.Figure3.5bshowstheconvexregionsofthesub-conceptscayand

cby.Theconvexhullη(c

ay)isa2Dpolygoninδa,andη(c

by)isa3Dpolytopein

δb.TheseconvexhullsarecalculatedbasedonthepointsPayandP

byfromthe

instancesinΓ(y)inFigure3.5a.

(a)InstancesinΓ(y)andtheircorrespondingsetofpoints,PayandP

by.

(b)Convexregionsofthesub-conceptscayandc

by.

Figure3.5:Aconceptrepresentationexampleinaconceptualspacewithdo-mainsδaandδb.


ellipsoid and Voronoi regions assign some points of the space to a concept’sregion, which are not necessarily between the concept’s known instances.

For a sub-concept ciy, the convex region η is defined as the convex hull (i.e.,convex polytope) of the points in Pi

y, as: η(ciy) = CH(Piy).

Example 3.8. Figure 3.5b shows the convex regions of the sub-concepts cay andcby. The convex hull η(cay) is a 2D polygon in δa, and η(cby) is a 3D polytope inδb. These convex hulls are calculated based on the points Pa

y and Pby from the

instances in Γ(y) in Figure 3.5a.

(a) Instances in Γ(y) and their corresponding set of points, Pay and Pb

y.

(b) Convex regions of the sub-concepts cay and cby.

Figure 3.5: A concept representation example in a conceptual space with do-mains δa and δb.


ellipsoidandVoronoiregionsassignsomepointsofthespacetoaconcept’sregion,whicharenotnecessarilybetweentheconcept’sknowninstances.

Forasub-conceptciy,theconvexregionηisdefinedastheconvexhull(i.e.,

convexpolytope)ofthepointsinPiy,as:η(c

iy)=CH(P

iy).

Example3.8.Figure3.5bshowstheconvexregionsofthesub-conceptscayand

cby.Theconvexhullη(c

ay)isa2Dpolygoninδa,andη(c

by)isa3Dpolytopein

δb.TheseconvexhullsarecalculatedbasedonthepointsPayandP

byfromthe

instancesinΓ(y)inFigure3.5a.

(a)InstancesinΓ(y)andtheircorrespondingsetofpoints,PayandP

by.

(b)Convexregionsofthesub-conceptscayandc

by.

Figure3.5:Aconceptrepresentationexampleinaconceptualspacewithdo-mainsδaandδb.

3.3. DISCUSSION 49

3.2.2 Context-dependent Weights of Concepts

Depending on the context, the salience given to various aspects of a conceptmay vary [86]. In the example of the apple concept, in one context the tastedomain might be more prominent, but in another context, shape domain canbe salient. In contrast, in such examples of concepts that there is no commonknowledge about the salience of the domains in various concepts, the data it-self determines the salience of domains and quality dimensions and defines thecontext-based weights for the concepts. In other words, the observations fromdifferent contexts define which domain and quality dimensions are more im-portant to represent the given concepts.

Example 3.9. For the example of leaf data set, suppose that shape and colourare the domains, and suppose that one wants to differ between the contextsof Swedish leaves and Japanese leaves. Knowing the common-sense knowledgeabout these contexts might be useless to determine the weights of the domainsand dimensions. However, based on the observed data in each of these contexts,one can realise that e.g., the quality dimensions in the shape domain are moresalient rather than the colour domain to represent the Swedish leaves, but thisinference may not be necessarily valid for Japanese leaves.

These relative degrees of salience assigned to the dimensions of the domainsimplicitly represent the notion of context. Here, the context-dependent weightsare already embedded in the representation by calculating the relevance of qual-ity dimensions to the concepts (i.e., the weights of the bipartite graph) whilespecifying the domains. The salient weights φ for a sub-concept ciy come fromthe assigned weights ωδi

in δi between Cy ∈ C(δi) and the any quality dimen-sion in Q(δi). Formally:

φ(ciy) = {ωδi(Cy,qi) | Cy ∈ C(δi), qi ∈ Q(δi)}. (3.6)

So, each sub-concept has its own set of weights in relation to the do-main’s quality dimensions. This point individualises the definition of context-dependent weights from the definition of context weights in other developedconceptual spaces. In other conceptual spaces, a set of overall weights is as-signed to the domains without considering the role. But here, for two indepen-dent sub-concepts, the assigned weights in the same domain might vary.

3.3 Discussion

This chapter has introduced a data-driven construction of conceptual spacesfrom known observations in a given numeric data set. The framework proposedhere has employed machine learning algorithms for the task of identifying rel-evant features and concepts in a numerical data set, to shape the domains andquality dimensions of a conceptual space. It has been argued that a set of se-lected and grouped features that provide discrimination between concepts are

3.3.DISCUSSION49

3.2.2Context-dependentWeightsofConcepts

Dependingonthecontext,thesaliencegiventovariousaspectsofaconceptmayvary[86].Intheexampleoftheappleconcept,inonecontextthetastedomainmightbemoreprominent,butinanothercontext,shapedomaincanbesalient.Incontrast,insuchexamplesofconceptsthatthereisnocommonknowledgeaboutthesalienceofthedomainsinvariousconcepts,thedatait-selfdeterminesthesalienceofdomainsandqualitydimensionsanddefinesthecontext-basedweightsfortheconcepts.Inotherwords,theobservationsfromdifferentcontextsdefinewhichdomainandqualitydimensionsaremoreim-portanttorepresentthegivenconcepts.

Example3.9.Fortheexampleofleafdataset,supposethatshapeandcolourarethedomains,andsupposethatonewantstodifferbetweenthecontextsofSwedishleavesandJapaneseleaves.Knowingthecommon-senseknowledgeaboutthesecontextsmightbeuselesstodeterminetheweightsofthedomainsanddimensions.However,basedontheobserveddataineachofthesecontexts,onecanrealisethate.g.,thequalitydimensionsintheshapedomainaremoresalientratherthanthecolourdomaintorepresenttheSwedishleaves,butthisinferencemaynotbenecessarilyvalidforJapaneseleaves.

Theserelativedegreesofsalienceassignedtothedimensionsofthedomainsimplicitlyrepresentthenotionofcontext.Here,thecontext-dependentweightsarealreadyembeddedintherepresentationbycalculatingtherelevanceofqual-itydimensionstotheconcepts(i.e.,theweightsofthebipartitegraph)whilespecifyingthedomains.Thesalientweightsφforasub-conceptc

iycomefrom

theassignedweightsωδiinδibetweenCy∈C(δi)andtheanyqualitydimen-sioninQ(δi).Formally:

φ(ciy)={ωδi(Cy,q

i)|Cy∈C(δi),q

i∈Q(δi)}.(3.6)

So,eachsub-concepthasitsownsetofweightsinrelationtothedo-main’squalitydimensions.Thispointindividualisesthedefinitionofcontext-dependentweightsfromthedefinitionofcontextweightsinotherdevelopedconceptualspaces.Inotherconceptualspaces,asetofoverallweightsisas-signedtothedomainswithoutconsideringtherole.Buthere,fortwoindepen-dentsub-concepts,theassignedweightsinthesamedomainmightvary.

3.3Discussion

Thischapterhasintroducedadata-drivenconstructionofconceptualspacesfromknownobservationsinagivennumericdataset.Theframeworkproposedherehasemployedmachinelearningalgorithmsforthetaskofidentifyingrel-evantfeaturesandconceptsinanumericaldataset,toshapethedomainsandqualitydimensionsofaconceptualspace.Ithasbeenarguedthatasetofse-lectedandgroupedfeaturesthatprovidediscriminationbetweenconceptsare

3.3. DISCUSSION 49

3.2.2 Context-dependent Weights of Concepts

Depending on the context, the salience given to various aspects of a conceptmay vary [86]. In the example of the apple concept, in one context the tastedomain might be more prominent, but in another context, shape domain canbe salient. In contrast, in such examples of concepts that there is no commonknowledge about the salience of the domains in various concepts, the data it-self determines the salience of domains and quality dimensions and defines thecontext-based weights for the concepts. In other words, the observations fromdifferent contexts define which domain and quality dimensions are more im-portant to represent the given concepts.

Example 3.9. For the example of leaf data set, suppose that shape and colourare the domains, and suppose that one wants to differ between the contextsof Swedish leaves and Japanese leaves. Knowing the common-sense knowledgeabout these contexts might be useless to determine the weights of the domainsand dimensions. However, based on the observed data in each of these contexts,one can realise that e.g., the quality dimensions in the shape domain are moresalient rather than the colour domain to represent the Swedish leaves, but thisinference may not be necessarily valid for Japanese leaves.

These relative degrees of salience assigned to the dimensions of the domainsimplicitly represent the notion of context. Here, the context-dependent weightsare already embedded in the representation by calculating the relevance of qual-ity dimensions to the concepts (i.e., the weights of the bipartite graph) whilespecifying the domains. The salient weights φ for a sub-concept ciy come fromthe assigned weights ωδi

in δi between Cy ∈ C(δi) and the any quality dimen-sion in Q(δi). Formally:

φ(ciy) = {ωδi(Cy,qi) | Cy ∈ C(δi), qi ∈ Q(δi)}. (3.6)

So, each sub-concept has its own set of weights in relation to the do-main’s quality dimensions. This point individualises the definition of context-dependent weights from the definition of context weights in other developedconceptual spaces. In other conceptual spaces, a set of overall weights is as-signed to the domains without considering the role. But here, for two indepen-dent sub-concepts, the assigned weights in the same domain might vary.

3.3 Discussion

This chapter has introduced a data-driven construction of conceptual spacesfrom known observations in a given numeric data set. The framework proposedhere has employed machine learning algorithms for the task of identifying rel-evant features and concepts in a numerical data set, to shape the domains andquality dimensions of a conceptual space. It has been argued that a set of se-lected and grouped features that provide discrimination between concepts are

3.3.DISCUSSION49

3.2.2Context-dependentWeightsofConcepts

Dependingonthecontext,thesaliencegiventovariousaspectsofaconceptmayvary[86].Intheexampleoftheappleconcept,inonecontextthetastedomainmightbemoreprominent,butinanothercontext,shapedomaincanbesalient.Incontrast,insuchexamplesofconceptsthatthereisnocommonknowledgeaboutthesalienceofthedomainsinvariousconcepts,thedatait-selfdeterminesthesalienceofdomainsandqualitydimensionsanddefinesthecontext-basedweightsfortheconcepts.Inotherwords,theobservationsfromdifferentcontextsdefinewhichdomainandqualitydimensionsaremoreim-portanttorepresentthegivenconcepts.

Example3.9.Fortheexampleofleafdataset,supposethatshapeandcolourarethedomains,andsupposethatonewantstodifferbetweenthecontextsofSwedishleavesandJapaneseleaves.Knowingthecommon-senseknowledgeaboutthesecontextsmightbeuselesstodeterminetheweightsofthedomainsanddimensions.However,basedontheobserveddataineachofthesecontexts,onecanrealisethate.g.,thequalitydimensionsintheshapedomainaremoresalientratherthanthecolourdomaintorepresenttheSwedishleaves,butthisinferencemaynotbenecessarilyvalidforJapaneseleaves.

Theserelativedegreesofsalienceassignedtothedimensionsofthedomainsimplicitlyrepresentthenotionofcontext.Here,thecontext-dependentweightsarealreadyembeddedintherepresentationbycalculatingtherelevanceofqual-itydimensionstotheconcepts(i.e.,theweightsofthebipartitegraph)whilespecifyingthedomains.Thesalientweightsφforasub-conceptc

iycomefrom

theassignedweightsωδiinδibetweenCy∈C(δi)andtheanyqualitydimen-sioninQ(δi).Formally:

φ(ciy)={ωδi(Cy,q

i)|Cy∈C(δi),q

i∈Q(δi)}.(3.6)

So,eachsub-concepthasitsownsetofweightsinrelationtothedo-main’squalitydimensions.Thispointindividualisesthedefinitionofcontext-dependentweightsfromthedefinitionofcontextweightsinotherdevelopedconceptualspaces.Inotherconceptualspaces,asetofoverallweightsisas-signedtothedomainswithoutconsideringtherole.Buthere,fortwoindepen-dentsub-concepts,theassignedweightsinthesamedomainmightvary.

3.3Discussion

Thischapterhasintroducedadata-drivenconstructionofconceptualspacesfromknownobservationsinagivennumericdataset.Theframeworkproposedherehasemployedmachinelearningalgorithmsforthetaskofidentifyingrel-evantfeaturesandconceptsinanumericaldataset,toshapethedomainsandqualitydimensionsofaconceptualspace.Ithasbeenarguedthatasetofse-lectedandgroupedfeaturesthatprovidediscriminationbetweenconceptsare


adequate to specify the domains and dimensions while preserving the semanticinterpretation of the features and concepts. Then, an instance-based approachhas been shown for the task of concept formation within the derived concep-tual space. A key finding from the data-driven construction of conceptual spaceis that it provides a generalisation for concept representation, where the modelcan be constructed or extend by different types of input instances.

This work has demonstrated how to generate a model with which it is pos-sible to create semantic interpretations of new observations. Keßler [131] statesthat any data-driven approach to generating conceptual spaces cannot be fullyautomated and require at some point external (symbolic) information. Follow-ing his statement, the presented approach in this thesis also relies to some degreeon pre-defined concepts related to the input data set, which make it a supervisedprocess. However, the processes of domain/quality dimension specification andconcept formation perform automatically based on the data. In other words,performing these processes to create conceptual spaces is not dependent on thesymbolic information of the provided knowledge about concepts and features.

In the following, some issues inherent to the data-driven approach are dis-cussed.

Interpretation of features: One important issue to address is to determine howinterpretable the selected features are for representing the derived concepts. In-herently, the quality dimensions capture the attributes that can cognitively cat-egorise the concepts [92]. Thus, it makes sense that the feature selection con-siders the separability between concepts when generating conceptual spaces ina data-driven manner. At the same time, a data-driven approach cannot be sep-arated entirely from meaningful semantics. Hence, a further implicit selectioncriterion has been to select features that can be expressed in natural language,or as stated by [86], features which can be given a meaningful perceptual in-terpretation. The interpretability of features is certainly context-dependent andoccurs at different levels of feature abstractness [214]. As an example, for agiven leaf sample, a large or small area of a leaf is not informative whereasknowing the elongation or wideness enables the model to depict a more mean-ingful description.

Semantics of domains: Another issue is how to form the domains in a con-ceptual space without human perception (Section 3.1). While the quality di-mensions can be mapped to the feature selection methods, the domains whichare formed by grouping the features should be semantically meaningful. In thisapproach, grouping the features is based on how well a subset of the featuresdistinctly represents the various concepts. However, there still exists the prob-lem of verifying the semantic dependency of the quality dimensions within adomain, to realise what quality dimensions are integral and what are separa-ble without necessarily involving background knowledge. While no solution is


adequatetospecifythedomainsanddimensionswhilepreservingthesemanticinterpretationofthefeaturesandconcepts.Then,aninstance-basedapproachhasbeenshownforthetaskofconceptformationwithinthederivedconcep-tualspace.Akeyfindingfromthedata-drivenconstructionofconceptualspaceisthatitprovidesageneralisationforconceptrepresentation,wherethemodelcanbeconstructedorextendbydifferenttypesofinputinstances.

Thisworkhasdemonstratedhowtogenerateamodelwithwhichitispos-sibletocreatesemanticinterpretationsofnewobservations.Keßler[131]statesthatanydata-drivenapproachtogeneratingconceptualspacescannotbefullyautomatedandrequireatsomepointexternal(symbolic)information.Follow-inghisstatement,thepresentedapproachinthisthesisalsoreliestosomedegreeonpre-definedconceptsrelatedtotheinputdataset,whichmakeitasupervisedprocess.However,theprocessesofdomain/qualitydimensionspecificationandconceptformationperformautomaticallybasedonthedata.Inotherwords,performingtheseprocessestocreateconceptualspacesisnotdependentonthesymbolicinformationoftheprovidedknowledgeaboutconceptsandfeatures.

Inthefollowing,someissuesinherenttothedata-drivenapproacharedis-cussed.

Interpretationoffeatures:Oneimportantissuetoaddressistodeterminehowinterpretabletheselectedfeaturesareforrepresentingthederivedconcepts.In-herently,thequalitydimensionscapturetheattributesthatcancognitivelycat-egorisetheconcepts[92].Thus,itmakessensethatthefeatureselectioncon-siderstheseparabilitybetweenconceptswhengeneratingconceptualspacesinadata-drivenmanner.Atthesametime,adata-drivenapproachcannotbesep-aratedentirelyfrommeaningfulsemantics.Hence,afurtherimplicitselectioncriterionhasbeentoselectfeaturesthatcanbeexpressedinnaturallanguage,orasstatedby[86],featureswhichcanbegivenameaningfulperceptualin-terpretation.Theinterpretabilityoffeaturesiscertainlycontext-dependentandoccursatdifferentlevelsoffeatureabstractness[214].Asanexample,foragivenleafsample,alargeorsmallareaofaleafisnotinformativewhereasknowingtheelongationorwidenessenablesthemodeltodepictamoremean-ingfuldescription.

Semanticsofdomains:Anotherissueishowtoformthedomainsinacon-ceptualspacewithouthumanperception(Section3.1).Whilethequalitydi-mensionscanbemappedtothefeatureselectionmethods,thedomainswhichareformedbygroupingthefeaturesshouldbesemanticallymeaningful.Inthisapproach,groupingthefeaturesisbasedonhowwellasubsetofthefeaturesdistinctlyrepresentsthevariousconcepts.However,therestillexiststheprob-lemofverifyingthesemanticdependencyofthequalitydimensionswithinadomain,torealisewhatqualitydimensionsareintegralandwhataresepara-blewithoutnecessarilyinvolvingbackgroundknowledge.Whilenosolutionis


adequate to specify the domains and dimensions while preserving the semanticinterpretation of the features and concepts. Then, an instance-based approachhas been shown for the task of concept formation within the derived concep-tual space. A key finding from the data-driven construction of conceptual spaceis that it provides a generalisation for concept representation, where the modelcan be constructed or extend by different types of input instances.

This work has demonstrated how to generate a model with which it is pos-sible to create semantic interpretations of new observations. Keßler [131] statesthat any data-driven approach to generating conceptual spaces cannot be fullyautomated and require at some point external (symbolic) information. Follow-ing his statement, the presented approach in this thesis also relies to some degreeon pre-defined concepts related to the input data set, which make it a supervisedprocess. However, the processes of domain/quality dimension specification andconcept formation perform automatically based on the data. In other words,performing these processes to create conceptual spaces is not dependent on thesymbolic information of the provided knowledge about concepts and features.

In the following, some issues inherent to the data-driven approach are dis-cussed.

Interpretation of features: One important issue to address is to determine howinterpretable the selected features are for representing the derived concepts. In-herently, the quality dimensions capture the attributes that can cognitively cat-egorise the concepts [92]. Thus, it makes sense that the feature selection con-siders the separability between concepts when generating conceptual spaces ina data-driven manner. At the same time, a data-driven approach cannot be sep-arated entirely from meaningful semantics. Hence, a further implicit selectioncriterion has been to select features that can be expressed in natural language,or as stated by [86], features which can be given a meaningful perceptual in-terpretation. The interpretability of features is certainly context-dependent andoccurs at different levels of feature abstractness [214]. As an example, for agiven leaf sample, a large or small area of a leaf is not informative whereasknowing the elongation or wideness enables the model to depict a more mean-ingful description.

Semantics of domains: Another issue is how to form the domains in a con-ceptual space without human perception (Section 3.1). While the quality di-mensions can be mapped to the feature selection methods, the domains whichare formed by grouping the features should be semantically meaningful. In thisapproach, grouping the features is based on how well a subset of the featuresdistinctly represents the various concepts. However, there still exists the prob-lem of verifying the semantic dependency of the quality dimensions within adomain, to realise what quality dimensions are integral and what are separa-ble without necessarily involving background knowledge. While no solution is


adequatetospecifythedomainsanddimensionswhilepreservingthesemanticinterpretationofthefeaturesandconcepts.Then,aninstance-basedapproachhasbeenshownforthetaskofconceptformationwithinthederivedconcep-tualspace.Akeyfindingfromthedata-drivenconstructionofconceptualspaceisthatitprovidesageneralisationforconceptrepresentation,wherethemodelcanbeconstructedorextendbydifferenttypesofinputinstances.

Thisworkhasdemonstratedhowtogenerateamodelwithwhichitispos-sibletocreatesemanticinterpretationsofnewobservations.Keßler[131]statesthatanydata-drivenapproachtogeneratingconceptualspacescannotbefullyautomatedandrequireatsomepointexternal(symbolic)information.Follow-inghisstatement,thepresentedapproachinthisthesisalsoreliestosomedegreeonpre-definedconceptsrelatedtotheinputdataset,whichmakeitasupervisedprocess.However,theprocessesofdomain/qualitydimensionspecificationandconceptformationperformautomaticallybasedonthedata.Inotherwords,performingtheseprocessestocreateconceptualspacesisnotdependentonthesymbolicinformationoftheprovidedknowledgeaboutconceptsandfeatures.

Inthefollowing,someissuesinherenttothedata-drivenapproacharedis-cussed.

Interpretationoffeatures:Oneimportantissuetoaddressistodeterminehowinterpretabletheselectedfeaturesareforrepresentingthederivedconcepts.In-herently,thequalitydimensionscapturetheattributesthatcancognitivelycat-egorisetheconcepts[92].Thus,itmakessensethatthefeatureselectioncon-siderstheseparabilitybetweenconceptswhengeneratingconceptualspacesinadata-drivenmanner.Atthesametime,adata-drivenapproachcannotbesep-aratedentirelyfrommeaningfulsemantics.Hence,afurtherimplicitselectioncriterionhasbeentoselectfeaturesthatcanbeexpressedinnaturallanguage,orasstatedby[86],featureswhichcanbegivenameaningfulperceptualin-terpretation.Theinterpretabilityoffeaturesiscertainlycontext-dependentandoccursatdifferentlevelsoffeatureabstractness[214].Asanexample,foragivenleafsample,alargeorsmallareaofaleafisnotinformativewhereasknowingtheelongationorwidenessenablesthemodeltodepictamoremean-ingfuldescription.

Semanticsofdomains:Anotherissueishowtoformthedomainsinacon-ceptualspacewithouthumanperception(Section3.1).Whilethequalitydi-mensionscanbemappedtothefeatureselectionmethods,thedomainswhichareformedbygroupingthefeaturesshouldbesemanticallymeaningful.Inthisapproach,groupingthefeaturesisbasedonhowwellasubsetofthefeaturesdistinctlyrepresentsthevariousconcepts.However,therestillexiststheprob-lemofverifyingthesemanticdependencyofthequalitydimensionswithinadomain,torealisewhatqualitydimensionsareintegralandwhataresepara-blewithoutnecessarilyinvolvingbackgroundknowledge.Whilenosolutionis

3.3. DISCUSSION 51

presented to this problem by this study, the problem itself has been discussedin the literature. For example, Gärdenfors in [86] suggests that the verificationof deciding whether two quality dimensions are integral or not can be doneby empirical testing based on the subject judgements as of the domain experts,and not necessarily using statistical or analytical techniques. It is seeminglydifficult, if not impossible, to realise the semantic dependency of the featuresanalytically. For example, by looking at the values of RGB as the dimensionsof the colour domain for a set of observations, there is no indication to realisetheir semantic relations. At the very least in the proposed approach, with re-lying on the associations between the observations and features, the methodconsiders quality dimensions to be integral if they have high relevance to eachother that measured by their high relevance to the concepts. Indeed, solvingthe issue of how to derive a grouping of features for domain specification, canlead to forming a general solution to the problem of determining an evaluationcriterion to choose between competing conceptual spaces, an issue raised byGärdenfors in [88].

Quality and quantity of known instances: On concept representation (Sec-tion 3.2), convex regions and the salient weights are induced in a data-drivenway by considering the observations as the instances of the concepts. Still, theserepresentations suppose that there are sufficient known instances that are repre-sentative of the concepts to determine geometric regions and the weights. Thisis a crucial point which is inherent in all data-driven methods. In the litera-ture of conceptual spaces, the knowledge-driven approaches have used simplethresholds or cutoffs to determine the regions, as illustrated in the definition ofthe regions for the mountain and hill concepts described by [8]. However, insuch systems like the leaf data set, it is not trivial to initialise in advance thespecific geometric boundaries for the categories of the leaf species as the con-cepts, or the precise salient weights the provided domains. So, in a data-drivenapproach it is essential to have an adequate set of the known instances of theconcepts, and thus an area that is worth to study is to determine on how todynamically adjust a conceptual space when a set of new observations emerge.

In sum, this chapter has explained the process of deriving the elements ofa conceptual space in a data-driven manner in order to represent concepts.This data-driven conceptual space will then be utilised for the task of semanticinference, which is explained in Chapter 4.

3.3.DISCUSSION51

presentedtothisproblembythisstudy,theproblemitselfhasbeendiscussedintheliterature.Forexample,Gärdenforsin[86]suggeststhattheverificationofdecidingwhethertwoqualitydimensionsareintegralornotcanbedonebyempiricaltestingbasedonthesubjectjudgementsasofthedomainexperts,andnotnecessarilyusingstatisticaloranalyticaltechniques.Itisseeminglydifficult,ifnotimpossible,torealisethesemanticdependencyofthefeaturesanalytically.Forexample,bylookingatthevaluesofRGBasthedimensionsofthecolourdomainforasetofobservations,thereisnoindicationtorealisetheirsemanticrelations.Attheveryleastintheproposedapproach,withre-lyingontheassociationsbetweentheobservationsandfeatures,themethodconsidersqualitydimensionstobeintegraliftheyhavehighrelevancetoeachotherthatmeasuredbytheirhighrelevancetotheconcepts.Indeed,solvingtheissueofhowtoderiveagroupingoffeaturesfordomainspecification,canleadtoformingageneralsolutiontotheproblemofdetermininganevaluationcriteriontochoosebetweencompetingconceptualspaces,anissueraisedbyGärdenforsin[88].

Qualityandquantityofknowninstances:Onconceptrepresentation(Sec-tion3.2),convexregionsandthesalientweightsareinducedinadata-drivenwaybyconsideringtheobservationsastheinstancesoftheconcepts.Still,theserepresentationssupposethattherearesufficientknowninstancesthatarerepre-sentativeoftheconceptstodeterminegeometricregionsandtheweights.Thisisacrucialpointwhichisinherentinalldata-drivenmethods.Inthelitera-tureofconceptualspaces,theknowledge-drivenapproacheshaveusedsimplethresholdsorcutoffstodeterminetheregions,asillustratedinthedefinitionoftheregionsforthemountainandhillconceptsdescribedby[8].However,insuchsystemsliketheleafdataset,itisnottrivialtoinitialiseinadvancethespecificgeometricboundariesforthecategoriesoftheleafspeciesasthecon-cepts,ortheprecisesalientweightstheprovideddomains.So,inadata-drivenapproachitisessentialtohaveanadequatesetoftheknowninstancesoftheconcepts,andthusanareathatisworthtostudyistodetermineonhowtodynamicallyadjustaconceptualspacewhenasetofnewobservationsemerge.

Insum,thischapterhasexplainedtheprocessofderivingtheelementsofaconceptualspaceinadata-drivenmannerinordertorepresentconcepts.Thisdata-drivenconceptualspacewillthenbeutilisedforthetaskofsemanticinference,whichisexplainedinChapter4.

3.3. DISCUSSION 51

presented to this problem by this study, the problem itself has been discussedin the literature. For example, Gärdenfors in [86] suggests that the verificationof deciding whether two quality dimensions are integral or not can be doneby empirical testing based on the subject judgements as of the domain experts,and not necessarily using statistical or analytical techniques. It is seeminglydifficult, if not impossible, to realise the semantic dependency of the featuresanalytically. For example, by looking at the values of RGB as the dimensionsof the colour domain for a set of observations, there is no indication to realisetheir semantic relations. At the very least in the proposed approach, with re-lying on the associations between the observations and features, the methodconsiders quality dimensions to be integral if they have high relevance to eachother that measured by their high relevance to the concepts. Indeed, solvingthe issue of how to derive a grouping of features for domain specification, canlead to forming a general solution to the problem of determining an evaluationcriterion to choose between competing conceptual spaces, an issue raised byGärdenfors in [88].

Quality and quantity of known instances: On concept representation (Sec-tion 3.2), convex regions and the salient weights are induced in a data-drivenway by considering the observations as the instances of the concepts. Still, theserepresentations suppose that there are sufficient known instances that are repre-sentative of the concepts to determine geometric regions and the weights. Thisis a crucial point which is inherent in all data-driven methods. In the litera-ture of conceptual spaces, the knowledge-driven approaches have used simplethresholds or cutoffs to determine the regions, as illustrated in the definition ofthe regions for the mountain and hill concepts described by [8]. However, insuch systems like the leaf data set, it is not trivial to initialise in advance thespecific geometric boundaries for the categories of the leaf species as the con-cepts, or the precise salient weights the provided domains. So, in a data-drivenapproach it is essential to have an adequate set of the known instances of theconcepts, and thus an area that is worth to study is to determine on how todynamically adjust a conceptual space when a set of new observations emerge.

In sum, this chapter has explained the process of deriving the elements ofa conceptual space in a data-driven manner in order to represent concepts.This data-driven conceptual space will then be utilised for the task of semanticinference, which is explained in Chapter 4.

3.3.DISCUSSION51

presentedtothisproblembythisstudy,theproblemitselfhasbeendiscussedintheliterature.Forexample,Gärdenforsin[86]suggeststhattheverificationofdecidingwhethertwoqualitydimensionsareintegralornotcanbedonebyempiricaltestingbasedonthesubjectjudgementsasofthedomainexperts,andnotnecessarilyusingstatisticaloranalyticaltechniques.Itisseeminglydifficult,ifnotimpossible,torealisethesemanticdependencyofthefeaturesanalytically.Forexample,bylookingatthevaluesofRGBasthedimensionsofthecolourdomainforasetofobservations,thereisnoindicationtorealisetheirsemanticrelations.Attheveryleastintheproposedapproach,withre-lyingontheassociationsbetweentheobservationsandfeatures,themethodconsidersqualitydimensionstobeintegraliftheyhavehighrelevancetoeachotherthatmeasuredbytheirhighrelevancetotheconcepts.Indeed,solvingtheissueofhowtoderiveagroupingoffeaturesfordomainspecification,canleadtoformingageneralsolutiontotheproblemofdetermininganevaluationcriteriontochoosebetweencompetingconceptualspaces,anissueraisedbyGärdenforsin[88].

Qualityandquantityofknowninstances:Onconceptrepresentation(Sec-tion3.2),convexregionsandthesalientweightsareinducedinadata-drivenwaybyconsideringtheobservationsastheinstancesoftheconcepts.Still,theserepresentationssupposethattherearesufficientknowninstancesthatarerepre-sentativeoftheconceptstodeterminegeometricregionsandtheweights.Thisisacrucialpointwhichisinherentinalldata-drivenmethods.Inthelitera-tureofconceptualspaces,theknowledge-drivenapproacheshaveusedsimplethresholdsorcutoffstodeterminetheregions,asillustratedinthedefinitionoftheregionsforthemountainandhillconceptsdescribedby[8].However,insuchsystemsliketheleafdataset,itisnottrivialtoinitialiseinadvancethespecificgeometricboundariesforthecategoriesoftheleafspeciesasthecon-cepts,ortheprecisesalientweightstheprovideddomains.So,inadata-drivenapproachitisessentialtohaveanadequatesetoftheknowninstancesoftheconcepts,andthusanareathatisworthtostudyistodetermineonhowtodynamicallyadjustaconceptualspacewhenasetofnewobservationsemerge.

Insum,thischapterhasexplainedtheprocessofderivingtheelementsofaconceptualspaceinadata-drivenmannerinordertorepresentconcepts.Thisdata-drivenconceptualspacewillthenbeutilisedforthetaskofsemanticinference,whichisexplainedinChapter4.

Chapter 4

Semantic Inference in

Conceptual Spaces

“There are no facts, only interpretations.”— Friedrich Nietzsche (1844–1900)

The aim of this chapter is to design an approach for inference in order toprovide a semantic characterisation for unknown observations. The focus

of this chapter is on solving two central questions (1) how a new (unknown)observation is represented in a conceptual space, and (2) how this representa-tion enables the inference of semantic descriptions for the observation.

The first question refers to the problem of induction in conceptual spacestheory [88]. To develop such inductions in a conceptual space, it is important torealise which concepts represent a new observed instance. Due to the geometri-cal representation of the conceptual space, the similarity between the instancesin the space enables the model to define the notion of inclusion as an operatorto measure the similarity of new observations to the specified concepts withinmetric domains.

The second question refers to the problem of symbol grounding in con-ceptual spaces theory [14]. The inference of a semantic representation for anyinput observation in natural language is enabled by defining a symbol space inthe conceptual space. With the use of the symbol space, the inference can bedone by semantic reasoning in which new observations are assigned to a set oflinguistic symbols in the symbol space.

This chapter first introduces the notion of a symbol space. Then, it proposesa process for semantic inference to provide linguistic descriptions for the newobservations based on the symbol space. In general, the proposed inference ina conceptual space consists of the following steps:

53

Chapter4

SemanticInferencein

ConceptualSpaces

“Therearenofacts,onlyinterpretations.”—FriedrichNietzsche(1844–1900)

Theaimofthischapteristodesignanapproachforinferenceinordertoprovideasemanticcharacterisationforunknownobservations.Thefocus

ofthischapterisonsolvingtwocentralquestions(1)howanew(unknown)observationisrepresentedinaconceptualspace,and(2)howthisrepresenta-tionenablestheinferenceofsemanticdescriptionsfortheobservation.

Thefirstquestionreferstotheproblemofinductioninconceptualspacestheory[88].Todevelopsuchinductionsinaconceptualspace,itisimportanttorealisewhichconceptsrepresentanewobservedinstance.Duetothegeometri-calrepresentationoftheconceptualspace,thesimilaritybetweentheinstancesinthespaceenablesthemodeltodefinethenotionofinclusionasanoperatortomeasurethesimilarityofnewobservationstothespecifiedconceptswithinmetricdomains.

Thesecondquestionreferstotheproblemofsymbolgroundingincon-ceptualspacestheory[14].Theinferenceofasemanticrepresentationforanyinputobservationinnaturallanguageisenabledbydefiningasymbolspaceintheconceptualspace.Withtheuseofthesymbolspace,theinferencecanbedonebysemanticreasoninginwhichnewobservationsareassignedtoasetoflinguisticsymbolsinthesymbolspace.

Thischapterfirstintroducesthenotionofasymbolspace.Then,itproposesaprocessforsemanticinferencetoprovidelinguisticdescriptionsforthenewobservationsbasedonthesymbolspace.Ingeneral,theproposedinferenceinaconceptualspaceconsistsofthefollowingsteps:

53

Chapter 4

Semantic Inference in

Conceptual Spaces

“There are no facts, only interpretations.”— Friedrich Nietzsche (1844–1900)

The aim of this chapter is to design an approach for inference in order toprovide a semantic characterisation for unknown observations. The focus

of this chapter is on solving two central questions (1) how a new (unknown)observation is represented in a conceptual space, and (2) how this representa-tion enables the inference of semantic descriptions for the observation.

The first question refers to the problem of induction in conceptual spacestheory [88]. To develop such inductions in a conceptual space, it is important torealise which concepts represent a new observed instance. Due to the geometri-cal representation of the conceptual space, the similarity between the instancesin the space enables the model to define the notion of inclusion as an operatorto measure the similarity of new observations to the specified concepts withinmetric domains.

The second question refers to the problem of symbol grounding in con-ceptual spaces theory [14]. The inference of a semantic representation for anyinput observation in natural language is enabled by defining a symbol space inthe conceptual space. With the use of the symbol space, the inference can bedone by semantic reasoning in which new observations are assigned to a set oflinguistic symbols in the symbol space.

This chapter first introduces the notion of a symbol space. Then, it proposesa process for semantic inference to provide linguistic descriptions for the newobservations based on the symbol space. In general, the proposed inference ina conceptual space consists of the following steps:

53

Chapter4

SemanticInferencein

ConceptualSpaces

“Therearenofacts,onlyinterpretations.”—FriedrichNietzsche(1844–1900)

Theaimofthischapteristodesignanapproachforinferenceinordertoprovideasemanticcharacterisationforunknownobservations.Thefocus

ofthischapterisonsolvingtwocentralquestions(1)howanew(unknown)observationisrepresentedinaconceptualspace,and(2)howthisrepresenta-tionenablestheinferenceofsemanticdescriptionsfortheobservation.

Thefirstquestionreferstotheproblemofinductioninconceptualspacestheory[88].Todevelopsuchinductionsinaconceptualspace,itisimportanttorealisewhichconceptsrepresentanewobservedinstance.Duetothegeometri-calrepresentationoftheconceptualspace,thesimilaritybetweentheinstancesinthespaceenablesthemodeltodefinethenotionofinclusionasanoperatortomeasurethesimilarityofnewobservationstothespecifiedconceptswithinmetricdomains.

Thesecondquestionreferstotheproblemofsymbolgroundingincon-ceptualspacestheory[14].Theinferenceofasemanticrepresentationforanyinputobservationinnaturallanguageisenabledbydefiningasymbolspaceintheconceptualspace.Withtheuseofthesymbolspace,theinferencecanbedonebysemanticreasoninginwhichnewobservationsareassignedtoasetoflinguisticsymbolsinthesymbolspace.

Thischapterfirstintroducesthenotionofasymbolspace.Then,itproposesaprocessforsemanticinferencetoprovidelinguisticdescriptionsforthenewobservationsbasedonthesymbolspace.Ingeneral,theproposedinferenceinaconceptualspaceconsistsofthefollowingsteps:

53

54 CHAPTER 4. SEMANTIC INFERENCE

Figure 4.1: Illustration of the steps of inferring linguistic descriptions for an un-known observation via the constructed conceptual spaces and its correspond-ing symbol space. The details of the semantic inference step is explained inSection 4.2.

• Defining the symbol space, based on the prior knowledge for linguisticcharacterisation of the concepts and quality dimensions,

• Inferring linguistic descriptions, for each new unknown observation,based on the inclusion of its corresponding instance in the concepts:

– Inference in conceptual space: specifying the geometrical location ofa new instance within the conceptual space, examining the inclusionof the instance, and determining the linguistic labels in the symbolspace based on the associated concepts and dimensions,

– Inference in the symbol space: annotating and characterising the in-stance from the provided set of symbolic terms, and generating lin-guistic descriptions.

Example 4.1. Consider the concepts and quality dimensions of the leaf con-ceptual space in Example 3.5. A new observed leaf can be either linguisticallyrepresented by a known concept (e.g., Cno) where “The new observation is aNerium leaf.”, or by a set of related quality dimensions (e.g., qel and qro), suchas “The new observation is an elongated and lance-shaped leaf.”

Figure 4.1 illustrates the step of inferring linguistic descriptions for an un-known observation through the constructed conceptual spaces and its corre-sponding symbol space. The details of each component are explained in Sec-tion 4.2, after formally defining the symbol space in Section 4.1.

4.1 Symbol Space Definition

According to a general formulation proposed by Aisbett and Gibbon [14], aconceptual space can be augmented with a symbol space. This extension pro-vides an internal mapping between geometrical elements in conceptual space

54CHAPTER4.SEMANTICINFERENCE

Figure4.1:Illustrationofthestepsofinferringlinguisticdescriptionsforanun-knownobservationviatheconstructedconceptualspacesanditscorrespond-ingsymbolspace.ThedetailsofthesemanticinferencestepisexplainedinSection4.2.

•Definingthesymbolspace,basedonthepriorknowledgeforlinguisticcharacterisationoftheconceptsandqualitydimensions,

•Inferringlinguisticdescriptions,foreachnewunknownobservation,basedontheinclusionofitscorrespondinginstanceintheconcepts:

–Inferenceinconceptualspace:specifyingthegeometricallocationofanewinstancewithintheconceptualspace,examiningtheinclusionoftheinstance,anddeterminingthelinguisticlabelsinthesymbolspacebasedontheassociatedconceptsanddimensions,

–Inferenceinthesymbolspace:annotatingandcharacterisingthein-stancefromtheprovidedsetofsymbolicterms,andgeneratinglin-guisticdescriptions.

Example4.1.Considertheconceptsandqualitydimensionsoftheleafcon-ceptualspaceinExample3.5.Anewobservedleafcanbeeitherlinguisticallyrepresentedbyaknownconcept(e.g.,Cno)where“ThenewobservationisaNeriumleaf.”,orbyasetofrelatedqualitydimensions(e.g.,qelandqro),suchas“Thenewobservationisanelongatedandlance-shapedleaf.”

Figure4.1illustratesthestepofinferringlinguisticdescriptionsforanun-knownobservationthroughtheconstructedconceptualspacesanditscorre-spondingsymbolspace.ThedetailsofeachcomponentareexplainedinSec-tion4.2,afterformallydefiningthesymbolspaceinSection4.1.

4.1SymbolSpaceDefinition

AccordingtoageneralformulationproposedbyAisbettandGibbon[14],aconceptualspacecanbeaugmentedwithasymbolspace.Thisextensionpro-videsaninternalmappingbetweengeometricalelementsinconceptualspace


Figure 4.1: Illustration of the steps of inferring linguistic descriptions for an un-known observation via the constructed conceptual spaces and its correspond-ing symbol space. The details of the semantic inference step is explained inSection 4.2.

• Defining the symbol space, based on the prior knowledge for linguisticcharacterisation of the concepts and quality dimensions,

• Inferring linguistic descriptions, for each new unknown observation,based on the inclusion of its corresponding instance in the concepts:

– Inference in conceptual space: specifying the geometrical location ofa new instance within the conceptual space, examining the inclusionof the instance, and determining the linguistic labels in the symbolspace based on the associated concepts and dimensions,

– Inference in the symbol space: annotating and characterising the in-stance from the provided set of symbolic terms, and generating lin-guistic descriptions.

Example 4.1. Consider the concepts and quality dimensions of the leaf con-ceptual space in Example 3.5. A new observed leaf can be either linguisticallyrepresented by a known concept (e.g., Cno) where “The new observation is aNerium leaf.”, or by a set of related quality dimensions (e.g., qel and qro), suchas “The new observation is an elongated and lance-shaped leaf.”

Figure 4.1 illustrates the step of inferring linguistic descriptions for an un-known observation through the constructed conceptual spaces and its corre-sponding symbol space. The details of each component are explained in Sec-tion 4.2, after formally defining the symbol space in Section 4.1.

4.1 Symbol Space Definition

According to a general formulation proposed by Aisbett and Gibbon [14], aconceptual space can be augmented with a symbol space. This extension pro-vides an internal mapping between geometrical elements in conceptual space


Figure4.1:Illustrationofthestepsofinferringlinguisticdescriptionsforanun-knownobservationviatheconstructedconceptualspacesanditscorrespond-ingsymbolspace.ThedetailsofthesemanticinferencestepisexplainedinSection4.2.

•Definingthesymbolspace,basedonthepriorknowledgeforlinguisticcharacterisationoftheconceptsandqualitydimensions,

•Inferringlinguisticdescriptions,foreachnewunknownobservation,basedontheinclusionofitscorrespondinginstanceintheconcepts:

–Inferenceinconceptualspace:specifyingthegeometricallocationofanewinstancewithintheconceptualspace,examiningtheinclusionoftheinstance,anddeterminingthelinguisticlabelsinthesymbolspacebasedontheassociatedconceptsanddimensions,

–Inferenceinthesymbolspace:annotatingandcharacterisingthein-stancefromtheprovidedsetofsymbolicterms,andgeneratinglin-guisticdescriptions.

Example4.1.Considertheconceptsandqualitydimensionsoftheleafcon-ceptualspaceinExample3.5.Anewobservedleafcanbeeitherlinguisticallyrepresentedbyaknownconcept(e.g.,Cno)where“ThenewobservationisaNeriumleaf.”,orbyasetofrelatedqualitydimensions(e.g.,qelandqro),suchas“Thenewobservationisanelongatedandlance-shapedleaf.”

Figure4.1illustratesthestepofinferringlinguisticdescriptionsforanun-knownobservationthroughtheconstructedconceptualspacesanditscorre-spondingsymbolspace.ThedetailsofeachcomponentareexplainedinSec-tion4.2,afterformallydefiningthesymbolspaceinSection4.1.

4.1SymbolSpaceDefinition

AccordingtoageneralformulationproposedbyAisbettandGibbon[14],aconceptualspacecanbeaugmentedwithasymbolspace.Thisextensionpro-videsaninternalmappingbetweengeometricalelementsinconceptualspace

4.1. SYMBOL SPACE DEFINITION 55

Conceptual Space

C = {C1,C2 . . .}

Q = {q1,q2 . . .}

Symbol Space

Concept LayerLC =

[dC1 , dC2 , . . .

]

Quality LayerLQ =

[dq1 , dq2 , . . .

]

Figure 4.2: Schematic of a conceptual space and the coupled symbol space.

(such as concepts, dimensions, domains, etc.) and the symbolic labels (typicallywords) in symbol space.

Definition 4.1. A symbol space S of size n is a space containing n symboldimensions LS, wherein each concept and quality dimension in the conceptualspace is linked to a symbol dimension. Symbol dimensions are isomorphic tothe real number interval [0, 1].

Each concept and quality dimension in the conceptual space is linked toa symbol dimension in the symbolic space. Based on the definition of Aisbettand Gibbon, the symbol dimensions need to be named by the primitive inputlabels of the associated concepts (classes) and quality dimensions (features). Theconstruction of the symbol space is a knowledge-based process, wherein theprior knowledge is encoded [14]. The prior knowledge specifies the symbolicexpressions in natural language form, related to the elements of conceptualspace S.

This study proposes a two-layer symbol space containing the symbol dimen-sions of the concepts, LC =

[dC1 , dC2 , . . .

], as concept layer, and the symbol

dimensions of quality dimensions, LQ =[dq1 , dq2 , . . .

], as quality layer. The

symbol dimensions LC and LQ are acquired from the set of input labels Y, andset of selected features F ′, respectively. So, for every concept Cy ∈ C, there is asymbol dimension in the concept layer, and for each quality dimension q ∈ Q,there is a symbol dimension in the quality layer. Figure 4.2 shows a schematicpresentation of the associations between the elements in a conceptual space andthe two-layer symbol dimensions in a symbol space.

Example 4.2. Consider the conceptual space of leaves Sl in Example 3.5. Theassociated symbol space Sl is defined with two-layer symbol dimensions, de-noted by LC =

[dtt : label(Ctt), dno : label(Cno)

]in the concept layer, and

LQ =[del : label(qel), dro : label(qro)

]in the quality layer.

Any instance in a conceptual space is then associated with a point in thesymbol space, namely a symbol vector. For a given instance γ, the associatedsymbol vector Vγ in S specifies the applicability of the symbol dimensions forγ in the range 0 to 1 for each dimension [14]. The symbol vector Vγ is a

4.1.SYMBOLSPACEDEFINITION55

ConceptualSpace

C={C1,C2...}

Q={q1,q2...}

SymbolSpace

ConceptLayerL

C=[dC1,dC2,...]

QualityLayerL

Q=[dq1,dq2,...]

Figure4.2:Schematicofaconceptualspaceandthecoupledsymbolspace.

(suchasconcepts,dimensions,domains,etc.)andthesymboliclabels(typicallywords)insymbolspace.

Definition4.1.AsymbolspaceSofsizenisaspacecontainingnsymboldimensionsL

S,whereineachconceptandqualitydimensionintheconceptual

spaceislinkedtoasymboldimension.Symboldimensionsareisomorphictotherealnumberinterval[0,1].

Eachconceptandqualitydimensionintheconceptualspaceislinkedtoasymboldimensioninthesymbolicspace.BasedonthedefinitionofAisbettandGibbon,thesymboldimensionsneedtobenamedbytheprimitiveinputlabelsoftheassociatedconcepts(classes)andqualitydimensions(features).Theconstructionofthesymbolspaceisaknowledge-basedprocess,whereinthepriorknowledgeisencoded[14].Thepriorknowledgespecifiesthesymbolicexpressionsinnaturallanguageform,relatedtotheelementsofconceptualspaceS.

Thisstudyproposesatwo-layersymbolspacecontainingthesymboldimen-sionsoftheconcepts,L

C=[dC1,dC2,...],asconceptlayer,andthesymbol

dimensionsofqualitydimensions,LQ=[dq1,dq2,...],asqualitylayer.The

symboldimensionsLC

andLQ

areacquiredfromthesetofinputlabelsY,andsetofselectedfeaturesF′,respectively.So,foreveryconceptCy∈C,thereisasymboldimensionintheconceptlayer,andforeachqualitydimensionq∈Q,thereisasymboldimensioninthequalitylayer.Figure4.2showsaschematicpresentationoftheassociationsbetweentheelementsinaconceptualspaceandthetwo-layersymboldimensionsinasymbolspace.

Example4.2.ConsidertheconceptualspaceofleavesSl

inExample3.5.TheassociatedsymbolspaceS

lisdefinedwithtwo-layersymboldimensions,de-

notedbyLC=[dtt:label(Ctt),dno:label(Cno)]intheconceptlayer,and

LQ=[del:label(qel),dro:label(qro)]inthequalitylayer.

Anyinstanceinaconceptualspaceisthenassociatedwithapointinthesymbolspace,namelyasymbolvector.Foragiveninstanceγ,theassociatedsymbolvectorVγinSspecifiestheapplicabilityofthesymboldimensionsforγintherange0to1foreachdimension[14].ThesymbolvectorVγisa

4.1. SYMBOL SPACE DEFINITION 55

Conceptual Space

C = {C1,C2 . . .}

Q = {q1,q2 . . .}

Symbol Space

Concept LayerLC =

[dC1 , dC2 , . . .

]

Quality LayerLQ =

[dq1 , dq2 , . . .

]

Figure 4.2: Schematic of a conceptual space and the coupled symbol space.

(such as concepts, dimensions, domains, etc.) and the symbolic labels (typicallywords) in symbol space.

Definition 4.1. A symbol space S of size n is a space containing n symboldimensions LS, wherein each concept and quality dimension in the conceptualspace is linked to a symbol dimension. Symbol dimensions are isomorphic tothe real number interval [0, 1].

Each concept and quality dimension in the conceptual space is linked toa symbol dimension in the symbolic space. Based on the definition of Aisbettand Gibbon, the symbol dimensions need to be named by the primitive inputlabels of the associated concepts (classes) and quality dimensions (features). Theconstruction of the symbol space is a knowledge-based process, wherein theprior knowledge is encoded [14]. The prior knowledge specifies the symbolicexpressions in natural language form, related to the elements of conceptualspace S.

This study proposes a two-layer symbol space containing the symbol dimen-sions of the concepts, LC =

[dC1 , dC2 , . . .

], as concept layer, and the symbol

dimensions of quality dimensions, LQ =[dq1 , dq2 , . . .

], as quality layer. The

symbol dimensions LC and LQ are acquired from the set of input labels Y, andset of selected features F ′, respectively. So, for every concept Cy ∈ C, there is asymbol dimension in the concept layer, and for each quality dimension q ∈ Q,there is a symbol dimension in the quality layer. Figure 4.2 shows a schematicpresentation of the associations between the elements in a conceptual space andthe two-layer symbol dimensions in a symbol space.

Example 4.2. Consider the conceptual space of leaves Sl in Example 3.5. Theassociated symbol space Sl is defined with two-layer symbol dimensions, de-noted by LC =

[dtt : label(Ctt), dno : label(Cno)

]in the concept layer, and

LQ =[del : label(qel), dro : label(qro)

]in the quality layer.

Any instance in a conceptual space is then associated with a point in thesymbol space, namely a symbol vector. For a given instance γ, the associatedsymbol vector Vγ in S specifies the applicability of the symbol dimensions forγ in the range 0 to 1 for each dimension [14]. The symbol vector Vγ is a

4.1.SYMBOLSPACEDEFINITION55

ConceptualSpace

C={C1,C2...}

Q={q1,q2...}

SymbolSpace

ConceptLayerL

C=[dC1,dC2,...]

QualityLayerL

Q=[dq1,dq2,...]

Figure4.2:Schematicofaconceptualspaceandthecoupledsymbolspace.

(suchasconcepts,dimensions,domains,etc.)andthesymboliclabels(typicallywords)insymbolspace.

Definition4.1.AsymbolspaceSofsizenisaspacecontainingnsymboldimensionsL

S,whereineachconceptandqualitydimensionintheconceptual

spaceislinkedtoasymboldimension.Symboldimensionsareisomorphictotherealnumberinterval[0,1].

Eachconceptandqualitydimensionintheconceptualspaceislinkedtoasymboldimensioninthesymbolicspace.BasedonthedefinitionofAisbettandGibbon,thesymboldimensionsneedtobenamedbytheprimitiveinputlabelsoftheassociatedconcepts(classes)andqualitydimensions(features).Theconstructionofthesymbolspaceisaknowledge-basedprocess,whereinthepriorknowledgeisencoded[14].Thepriorknowledgespecifiesthesymbolicexpressionsinnaturallanguageform,relatedtotheelementsofconceptualspaceS.

Thisstudyproposesatwo-layersymbolspacecontainingthesymboldimen-sionsoftheconcepts,L

C=[dC1,dC2,...],asconceptlayer,andthesymbol

dimensionsofqualitydimensions,LQ=[dq1,dq2,...],asqualitylayer.The

symboldimensionsLC

andLQ

areacquiredfromthesetofinputlabelsY,andsetofselectedfeaturesF′,respectively.So,foreveryconceptCy∈C,thereisasymboldimensionintheconceptlayer,andforeachqualitydimensionq∈Q,thereisasymboldimensioninthequalitylayer.Figure4.2showsaschematicpresentationoftheassociationsbetweentheelementsinaconceptualspaceandthetwo-layersymboldimensionsinasymbolspace.


inExample3.5.TheassociatedsymbolspaceS

lisdefinedwithtwo-layersymboldimensions,de-

notedbyLC=[dtt:label(Ctt),dno:label(Cno)]intheconceptlayer,and

LQ=[del:label(qel),dro:label(qro)]inthequalitylayer.

Anyinstanceinaconceptualspaceisthenassociatedwithapointinthesymbolspace,namelyasymbolvector.Foragiveninstanceγ,theassociatedsymbolvectorVγinSspecifiestheapplicabilityofthesymboldimensionsforγintherange0to1foreachdimension[14].ThesymbolvectorVγisa


concatenation of two vectors Vγ :< VCγ,VQ

γ >, one vector in the concept layerand one vector in quality layer, respectively. Thus, |VC

γ| = |LC|, and |VQγ | = |LQ|.

Example 4.3. Consider the conceptual space of leaves Sl in Example 3.5. For anew leaf instance γ, the symbol vector Vγ is defined as a 4-dimensional vectorwith concatenation of VC

γ =< vdtt , vdno >, and ,VQγ =< vdel , vdro >.

The symbol vector is defined as a two-element vector, wherein each vi ∈ Vγ

consists of a pair of values vi = (label, value). The label shows the relatedsymbolic term and the value shows how representative is the instance to thedimension di (either how similar to its concepts or how related to the qualitydimensions). The notion VC

γ(Cy) = vdCyindicates the value of the symbol vector

in the concept layer for the dimension related to the concept Cy, and similarly,VQγ(qj) = vdqj

indicates the value of the symbol vector in the quality layer forthe dimension related to the quality dimension qj. The further sections explainhow the elements of a symbol vector for a new instance are assigned valuesbased on the inclusion of the instance within the domains.

4.2 Inferring Linguistic Descriptions for Unknown

Observations

For any given unknown observation, the goal is to infer a semantic descrip-tion in natural language form. The core of the inference process is to cope withthe notion of similarity in conceptual spaces. To place the new instances in thespace and choose the best concepts that include (or are similar enough to) anobservation [189]. The metric structure of a geometrical conceptual space en-ables the model to measure the semantic similarity of concepts and instancesin the space [9]. The proposed construction of the conceptual space in Chap-ter 3 facilitates these measurements since the representations of concepts and in-stances span across domains using the geometric elements, i.e., convex regionsand points. From the point of view of NLG, inferring linguistic descriptionsfor unknown observations covers the main tasks of an NLG pipeline for gener-ating natural language text out of non-linguistic data: Content determination,Microplanning (including lexicalisation), and Realisation [185]. This phase em-ploys various developed methods for linguistic descriptions (i.e., fuzzy set the-ory [177]) to ease the process of quantifying the location of unknown sampleswithin a conceptual space, and infer the proper linguistic terms.

The process of inferring linguistic descriptions for an instance γ ′ is pre-sented in two following phases.

• Phase A: Inference in Conceptual Space, that first determines the inclu-sion of the new instance γ ′ in the concepts within the domains Δ(γ ′)using semantic similarity, and then sets the values of symbol vector Vγ′

(performing the content determination task).


concatenationoftwovectorsVγ:<VCγ,V

Qγ>,onevectorintheconceptlayer

andonevectorinqualitylayer,respectively.Thus,|VCγ|=|L

C|,and|V

Qγ|=|L

Q|.


inExample3.5.Foranewleafinstanceγ,thesymbolvectorVγisdefinedasa4-dimensionalvectorwithconcatenationofV

Cγ=<vdtt,vdno>,and,V

Qγ=<vdel,vdro>.

Thesymbolvectorisdefinedasatwo-elementvector,whereineachvi∈Vγ

consistsofapairofvaluesvi=(label,value).Thelabelshowstherelatedsymbolictermandthevalueshowshowrepresentativeistheinstancetothedimensiondi(eitherhowsimilartoitsconceptsorhowrelatedtothequalitydimensions).ThenotionV

Cγ(Cy)=vdCyindicatesthevalueofthesymbolvector

intheconceptlayerforthedimensionrelatedtotheconceptCy,andsimilarly,VQγ(qj)=vdqjindicatesthevalueofthesymbolvectorinthequalitylayerfor

thedimensionrelatedtothequalitydimensionqj.Thefurthersectionsexplainhowtheelementsofasymbolvectorforanewinstanceareassignedvaluesbasedontheinclusionoftheinstancewithinthedomains.

4.2InferringLinguisticDescriptionsforUnknown

Observations

Foranygivenunknownobservation,thegoalistoinferasemanticdescrip-tioninnaturallanguageform.Thecoreoftheinferenceprocessistocopewiththenotionofsimilarityinconceptualspaces.Toplacethenewinstancesinthespaceandchoosethebestconceptsthatinclude(oraresimilarenoughto)anobservation[189].Themetricstructureofageometricalconceptualspaceen-ablesthemodeltomeasurethesemanticsimilarityofconceptsandinstancesinthespace[9].TheproposedconstructionoftheconceptualspaceinChap-ter3facilitatesthesemeasurementssincetherepresentationsofconceptsandin-stancesspanacrossdomainsusingthegeometricelements,i.e.,convexregionsandpoints.FromthepointofviewofNLG,inferringlinguisticdescriptionsforunknownobservationscoversthemaintasksofanNLGpipelineforgener-atingnaturallanguagetextoutofnon-linguisticdata:Contentdetermination,Microplanning(includinglexicalisation),andRealisation[185].Thisphaseem-ploysvariousdevelopedmethodsforlinguisticdescriptions(i.e.,fuzzysetthe-ory[177])toeasetheprocessofquantifyingthelocationofunknownsampleswithinaconceptualspace,andinfertheproperlinguisticterms.

Theprocessofinferringlinguisticdescriptionsforaninstanceγ′ispre-sentedintwofollowingphases.

•PhaseA:InferenceinConceptualSpace,thatfirstdeterminestheinclu-sionofthenewinstanceγ′intheconceptswithinthedomainsΔ(γ′)usingsemanticsimilarity,andthensetsthevaluesofsymbolvectorVγ′

(performingthecontentdeterminationtask).


concatenation of two vectors Vγ :< VCγ,VQ

γ >, one vector in the concept layerand one vector in quality layer, respectively. Thus, |VC

γ| = |LC|, and |VQγ | = |LQ|.

Example 4.3. Consider the conceptual space of leaves Sl in Example 3.5. For anew leaf instance γ, the symbol vector Vγ is defined as a 4-dimensional vectorwith concatenation of VC

γ =< vdtt , vdno >, and ,VQγ =< vdel , vdro >.

The symbol vector is defined as a two-element vector, wherein each vi ∈ Vγ

consists of a pair of values vi = (label, value). The label shows the relatedsymbolic term and the value shows how representative is the instance to thedimension di (either how similar to its concepts or how related to the qualitydimensions). The notion VC

γ(Cy) = vdCyindicates the value of the symbol vector

in the concept layer for the dimension related to the concept Cy, and similarly,VQγ(qj) = vdqj

indicates the value of the symbol vector in the quality layer forthe dimension related to the quality dimension qj. The further sections explainhow the elements of a symbol vector for a new instance are assigned valuesbased on the inclusion of the instance within the domains.

4.2 Inferring Linguistic Descriptions for Unknown

Observations

For any given unknown observation, the goal is to infer a semantic descrip-tion in natural language form. The core of the inference process is to cope withthe notion of similarity in conceptual spaces. To place the new instances in thespace and choose the best concepts that include (or are similar enough to) anobservation [189]. The metric structure of a geometrical conceptual space en-ables the model to measure the semantic similarity of concepts and instancesin the space [9]. The proposed construction of the conceptual space in Chap-ter 3 facilitates these measurements since the representations of concepts and in-stances span across domains using the geometric elements, i.e., convex regionsand points. From the point of view of NLG, inferring linguistic descriptionsfor unknown observations covers the main tasks of an NLG pipeline for gener-ating natural language text out of non-linguistic data: Content determination,Microplanning (including lexicalisation), and Realisation [185]. This phase em-ploys various developed methods for linguistic descriptions (i.e., fuzzy set the-ory [177]) to ease the process of quantifying the location of unknown sampleswithin a conceptual space, and infer the proper linguistic terms.

The process of inferring linguistic descriptions for an instance γ ′ is pre-sented in two following phases.

• Phase A: Inference in Conceptual Space, that first determines the inclu-sion of the new instance γ ′ in the concepts within the domains Δ(γ ′)using semantic similarity, and then sets the values of symbol vector Vγ′

(performing the content determination task).


concatenationoftwovectorsVγ:<VCγ,V

Qγ>,onevectorintheconceptlayer

andonevectorinqualitylayer,respectively.Thus,|VCγ|=|L

C|,and|V

Qγ|=|L

Q|.


inExample3.5.Foranewleafinstanceγ,thesymbolvectorVγisdefinedasa4-dimensionalvectorwithconcatenationofV

Cγ=<vdtt,vdno>,and,V

Qγ=<vdel,vdro>.

Thesymbolvectorisdefinedasatwo-elementvector,whereineachvi∈Vγ

consistsofapairofvaluesvi=(label,value).Thelabelshowstherelatedsymbolictermandthevalueshowshowrepresentativeistheinstancetothedimensiondi(eitherhowsimilartoitsconceptsorhowrelatedtothequalitydimensions).ThenotionV

Cγ(Cy)=vdCyindicatesthevalueofthesymbolvector

intheconceptlayerforthedimensionrelatedtotheconceptCy,andsimilarly,VQγ(qj)=vdqjindicatesthevalueofthesymbolvectorinthequalitylayerfor

thedimensionrelatedtothequalitydimensionqj.Thefurthersectionsexplainhowtheelementsofasymbolvectorforanewinstanceareassignedvaluesbasedontheinclusionoftheinstancewithinthedomains.

4.2InferringLinguisticDescriptionsforUnknown

Observations

Foranygivenunknownobservation,thegoalistoinferasemanticdescrip-tioninnaturallanguageform.Thecoreoftheinferenceprocessistocopewiththenotionofsimilarityinconceptualspaces.Toplacethenewinstancesinthespaceandchoosethebestconceptsthatinclude(oraresimilarenoughto)anobservation[189].Themetricstructureofageometricalconceptualspaceen-ablesthemodeltomeasurethesemanticsimilarityofconceptsandinstancesinthespace[9].TheproposedconstructionoftheconceptualspaceinChap-ter3facilitatesthesemeasurementssincetherepresentationsofconceptsandin-stancesspanacrossdomainsusingthegeometricelements,i.e.,convexregionsandpoints.FromthepointofviewofNLG,inferringlinguisticdescriptionsforunknownobservationscoversthemaintasksofanNLGpipelineforgener-atingnaturallanguagetextoutofnon-linguisticdata:Contentdetermination,Microplanning(includinglexicalisation),andRealisation[185].Thisphaseem-ploysvariousdevelopedmethodsforlinguisticdescriptions(i.e.,fuzzysetthe-ory[177])toeasetheprocessofquantifyingthelocationofunknownsampleswithinaconceptualspace,andinfertheproperlinguisticterms.

Theprocessofinferringlinguisticdescriptionsforaninstanceγ′ispre-sentedintwofollowingphases.

•PhaseA:InferenceinConceptualSpace,thatfirstdeterminestheinclu-sionofthenewinstanceγ′intheconceptswithinthedomainsΔ(γ′)usingsemanticsimilarity,andthensetsthevaluesofsymbolvectorVγ′

(performingthecontentdeterminationtask).

4.2. INFERRING LINGUISTIC DESCRIPTIONS 57

Symbol SpaceInference

(algorithm 4.2)

ConceptualSpace Inference(algorithm 4.1)

γ ′,Vγ′o ′ ∈ D ′

S : 〈Q,Δ,C, Γ〉Tγ′(o ′)

Semantic Inference of Linguistic Descriptions

Figure 4.3: Two phases of the semantic inference for generating linguistic de-scriptions, with the input and output parameters of each phase.

• Phase B: Inference in Symbol Space, that verbalises the symbol vectorVγ′ into a set of lexical items which are human-readable descriptions.(performing the lexicalisation and realisation tasks).

For a given set of new observations D ′ = {o ′i : (xo′

i)}, let γ ′ be the corre-

sponding instance to the unknown observation o ′ ∈ D ′ which is not assignedto any of the known class labels of Y. Also, let Δ(γ ′) ⊆ Δ be a set of domainsthat γ ′ has corresponding points in each of them1, where |Δ(γ ′)| = k ′.

Figure 4.3 illustrates the phases of inferring a linguistic description for anew observation, with their input and output parameters. The details of thephases A and B are explained in the following sections.

4.2.1 Phase A: Inference in Conceptual Space (Content

Determination)

This phase first presents the comparison of the new instance with each of theconcepts using the similarity measure to check whether it is included withinconcept regions or not. Then, based on the result of this inclusion, it describeshow to initialise a symbol vector and set its values in both concept layer andquality layer. From the NLG perspective, this performs the task of content de-termination [182], that decides which set of information is required to charac-terise a new observation in the final description. The process of checking theinclusion is a simple fuzzy extension of an instance-based method that measuresthe membership of the new instance to the nearest labelled region of instances.Although any classification approach can do it, the process is formulated herewith respect to the definitions of the introduced conceptual space.

For an instance γ ′, the symbol vector Vγ′ is calculated based on the in-clusion of the instance points pγ′ in different regions within the domains. Asdefined before, γ ′ is represented with a set of points γ ′ = {p1

γ′ , . . . ,pk′γ′ }. In

1It is notable that a new observation is not necessarily defined in all domains, since there mightbe no calculated values for some of the features/quality dimensions. So, the corresponding instancemay not have points in all the provided domains.

4.2.INFERRINGLINGUISTICDESCRIPTIONS57

SymbolSpaceInference

(algorithm4.2)

ConceptualSpaceInference(algorithm4.1)

γ′,Vγ′ o′∈D′

S:〈Q,Δ,C,Γ〉Tγ′(o′)

SemanticInferenceofLinguisticDescriptions

Figure4.3:Twophasesofthesemanticinferenceforgeneratinglinguisticde-scriptions,withtheinputandoutputparametersofeachphase.

•PhaseB:InferenceinSymbolSpace,thatverbalisesthesymbolvectorVγ′intoasetoflexicalitemswhicharehuman-readabledescriptions.(performingthelexicalisationandrealisationtasks).

ForagivensetofnewobservationsD′={o′i:(xo′

i)},letγ′bethecorre-spondinginstancetotheunknownobservationo′∈D′whichisnotassignedtoanyoftheknownclasslabelsofY.Also,letΔ(γ′)⊆Δbeasetofdomainsthatγ′hascorrespondingpointsineachofthem1,where|Δ(γ′)|=k′.

Figure4.3illustratesthephasesofinferringalinguisticdescriptionforanewobservation,withtheirinputandoutputparameters.ThedetailsofthephasesAandBareexplainedinthefollowingsections.

4.2.1PhaseA:InferenceinConceptualSpace(Content

Determination)

Thisphasefirstpresentsthecomparisonofthenewinstancewitheachoftheconceptsusingthesimilaritymeasuretocheckwhetheritisincludedwithinconceptregionsornot.Then,basedontheresultofthisinclusion,itdescribeshowtoinitialiseasymbolvectorandsetitsvaluesinbothconceptlayerandqualitylayer.FromtheNLGperspective,thisperformsthetaskofcontentde-termination[182],thatdecideswhichsetofinformationisrequiredtocharac-teriseanewobservationinthefinaldescription.Theprocessofcheckingtheinclusionisasimplefuzzyextensionofaninstance-basedmethodthatmeasuresthemembershipofthenewinstancetothenearestlabelledregionofinstances.Althoughanyclassificationapproachcandoit,theprocessisformulatedherewithrespecttothedefinitionsoftheintroducedconceptualspace.

Foraninstanceγ′,thesymbolvectorVγ′iscalculatedbasedonthein-clusionoftheinstancepointspγ′indifferentregionswithinthedomains.Asdefinedbefore,γ′isrepresentedwithasetofpointsγ′={p1

γ′,...,pk′γ′}.In

1Itisnotablethatanewobservationisnotnecessarilydefinedinalldomains,sincetheremightbenocalculatedvaluesforsomeofthefeatures/qualitydimensions.So,thecorrespondinginstancemaynothavepointsinalltheprovideddomains.


Symbol SpaceInference

(algorithm 4.2)

ConceptualSpace Inference(algorithm 4.1)

γ ′,Vγ′o ′ ∈ D ′

S : 〈Q,Δ,C, Γ〉Tγ′(o ′)

Semantic Inference of Linguistic Descriptions

Figure 4.3: Two phases of the semantic inference for generating linguistic de-scriptions, with the input and output parameters of each phase.

• Phase B: Inference in Symbol Space, that verbalises the symbol vectorVγ′ into a set of lexical items which are human-readable descriptions.(performing the lexicalisation and realisation tasks).

For a given set of new observations D ′ = {o ′i : (xo′

i)}, let γ ′ be the corre-

sponding instance to the unknown observation o ′ ∈ D ′ which is not assignedto any of the known class labels of Y. Also, let Δ(γ ′) ⊆ Δ be a set of domainsthat γ ′ has corresponding points in each of them1, where |Δ(γ ′)| = k ′.

Figure 4.3 illustrates the phases of inferring a linguistic description for anew observation, with their input and output parameters. The details of thephases A and B are explained in the following sections.

4.2.1 Phase A: Inference in Conceptual Space (Content

Determination)

This phase first presents the comparison of the new instance with each of theconcepts using the similarity measure to check whether it is included withinconcept regions or not. Then, based on the result of this inclusion, it describeshow to initialise a symbol vector and set its values in both concept layer andquality layer. From the NLG perspective, this performs the task of content de-termination [182], that decides which set of information is required to charac-terise a new observation in the final description. The process of checking theinclusion is a simple fuzzy extension of an instance-based method that measuresthe membership of the new instance to the nearest labelled region of instances.Although any classification approach can do it, the process is formulated herewith respect to the definitions of the introduced conceptual space.

For an instance γ ′, the symbol vector Vγ′ is calculated based on the in-clusion of the instance points pγ′ in different regions within the domains. Asdefined before, γ ′ is represented with a set of points γ ′ = {p1

γ′ , . . . ,pk′γ′ }. In

1It is notable that a new observation is not necessarily defined in all domains, since there mightbe no calculated values for some of the features/quality dimensions. So, the corresponding instancemay not have points in all the provided domains.


SymbolSpaceInference

(algorithm4.2)

ConceptualSpaceInference(algorithm4.1)

γ′,Vγ′ o′∈D′

S:〈Q,Δ,C,Γ〉Tγ′(o′)

SemanticInferenceofLinguisticDescriptions

Figure4.3:Twophasesofthesemanticinferenceforgeneratinglinguisticde-scriptions,withtheinputandoutputparametersofeachphase.

•PhaseB:InferenceinSymbolSpace,thatverbalisesthesymbolvectorVγ′intoasetoflexicalitemswhicharehuman-readabledescriptions.(performingthelexicalisationandrealisationtasks).

ForagivensetofnewobservationsD′={o′i:(xo′

i)},letγ′bethecorre-spondinginstancetotheunknownobservationo′∈D′whichisnotassignedtoanyoftheknownclasslabelsofY.Also,letΔ(γ′)⊆Δbeasetofdomainsthatγ′hascorrespondingpointsineachofthem1,where|Δ(γ′)|=k′.

Figure4.3illustratesthephasesofinferringalinguisticdescriptionforanewobservation,withtheirinputandoutputparameters.ThedetailsofthephasesAandBareexplainedinthefollowingsections.

4.2.1PhaseA:InferenceinConceptualSpace(Content

Determination)

Thisphasefirstpresentsthecomparisonofthenewinstancewitheachoftheconceptsusingthesimilaritymeasuretocheckwhetheritisincludedwithinconceptregionsornot.Then,basedontheresultofthisinclusion,itdescribeshowtoinitialiseasymbolvectorandsetitsvaluesinbothconceptlayerandqualitylayer.FromtheNLGperspective,thisperformsthetaskofcontentde-termination[182],thatdecideswhichsetofinformationisrequiredtocharac-teriseanewobservationinthefinaldescription.Theprocessofcheckingtheinclusionisasimplefuzzyextensionofaninstance-basedmethodthatmeasuresthemembershipofthenewinstancetothenearestlabelledregionofinstances.Althoughanyclassificationapproachcandoit,theprocessisformulatedherewithrespecttothedefinitionsoftheintroducedconceptualspace.

Foraninstanceγ′,thesymbolvectorVγ′iscalculatedbasedonthein-clusionoftheinstancepointspγ′indifferentregionswithinthedomains.Asdefinedbefore,γ′isrepresentedwithasetofpointsγ′={p1

γ′,...,pk′γ′}.In

1Itisnotablethatanewobservationisnotnecessarilydefinedinalldomains,sincetheremightbenocalculatedvaluesforsomeofthefeatures/qualitydimensions.So,thecorrespondinginstancemaynothavepointsinalltheprovideddomains.


general, with placing a new instance in a conceptual space, four cases can oc-cur. Without losing generality, assume γ ′ consists of two points in two domainsδa and δb, denoted by γ ′ = {pa

γ,pbγ}. Also assume that there are two concepts

Cy1 and Cy2 that have been represented in one or both of these two domains.Figure 4.4 shows the four different cases with respect to various positions ofthe points pa and pb, and their relations to the sub-concepts’ regions withinthe domains. One instance can be located in the space differently as follows:

1. Totally included in a concept within all the domains (case one, Fig-ure 4.4a),

2. Partially included in just a concept (case two, Figure 4.4b),3. Partially included in two distinct concepts (case three, Figure 4.4c),4. Not included in any concept (case four, Figure 4.4d).

(a) Case one: γ ′ is totally included in concept Cyi(in all the domains).


general,withplacinganewinstanceinaconceptualspace,fourcasescanoc-cur.Withoutlosinggenerality,assumeγ′consistsoftwopointsintwodomainsδaandδb,denotedbyγ′={p

aγ,p

bγ}.Alsoassumethattherearetwoconcepts

Cy1andCy2thathavebeenrepresentedinoneorbothofthesetwodomains.Figure4.4showsthefourdifferentcaseswithrespecttovariouspositionsofthepointsp

aandp

b,andtheirrelationstothesub-concepts’regionswithin

thedomains.Oneinstancecanbelocatedinthespacedifferentlyasfollows:

1.Totallyincludedinaconceptwithinallthedomains(caseone,Fig-ure4.4a),

2.Partiallyincludedinjustaconcept(casetwo,Figure4.4b),3.Partiallyincludedintwodistinctconcepts(casethree,Figure4.4c),4.Notincludedinanyconcept(casefour,Figure4.4d).

(a)Caseone:γ′istotallyincludedinconceptCyi(inallthedomains).


general, with placing a new instance in a conceptual space, four cases can oc-cur. Without losing generality, assume γ ′ consists of two points in two domainsδa and δb, denoted by γ ′ = {pa

γ,pbγ}. Also assume that there are two concepts

Cy1 and Cy2 that have been represented in one or both of these two domains.Figure 4.4 shows the four different cases with respect to various positions ofthe points pa and pb, and their relations to the sub-concepts’ regions withinthe domains. One instance can be located in the space differently as follows:

1. Totally included in a concept within all the domains (case one, Fig-ure 4.4a),

2. Partially included in just a concept (case two, Figure 4.4b),3. Partially included in two distinct concepts (case three, Figure 4.4c),4. Not included in any concept (case four, Figure 4.4d).

(a) Case one: γ ′ is totally included in concept Cyi(in all the domains).


general,withplacinganewinstanceinaconceptualspace,fourcasescanoc-cur.Withoutlosinggenerality,assumeγ′consistsoftwopointsintwodomainsδaandδb,denotedbyγ′={p

aγ,p

bγ}.Alsoassumethattherearetwoconcepts

Cy1andCy2thathavebeenrepresentedinoneorbothofthesetwodomains.Figure4.4showsthefourdifferentcaseswithrespecttovariouspositionsofthepointsp

aandp

b,andtheirrelationstothesub-concepts’regionswithin

thedomains.Oneinstancecanbelocatedinthespacedifferentlyasfollows:

1.Totallyincludedinaconceptwithinallthedomains(caseone,Fig-ure4.4a),

2.Partiallyincludedinjustaconcept(casetwo,Figure4.4b),3.Partiallyincludedintwodistinctconcepts(casethree,Figure4.4c),4.Notincludedinanyconcept(casefour,Figure4.4d).

(a)Caseone:γ′istotallyincludedinconceptCyi(inallthedomains).


(b) Case two: γ ′ is partially included in concept Cyi(in one domain).

(c) Case three: γ ′ is partially included in two concepts Cyiand Cyj

.


(b)Casetwo:γ′ispartiallyincludedinconceptCyi(inonedomain).

(c)Casethree:γ′ispartiallyincludedintwoconceptsCyiandCyj.


(b) Case two: γ ′ is partially included in concept Cyi(in one domain).

(c) Case three: γ ′ is partially included in two concepts Cyiand Cyj

.


(b)Casetwo:γ′ispartiallyincludedinconceptCyi(inonedomain).

(c)Casethree:γ′ispartiallyincludedintwoconceptsCyiandCyj.


(d) Case four: γ ′ is not included in any concept.

Figure 4.4: An illustration of four different cases with respect to the variouspositions of an instance points γ ′ = {pa,pb} within two domains δa and δb,together with the assigned values to symbol vector Vγ′ , according to the inclu-sion of the points in the presented sub-concepts.

To assign a concept to γ ′, it is necessary to check the inclusion of the in-stance points in the concept’s regions within Δ(γ ′) using a similarity measure.Within a single domain, two states need to be considered: 1) If the instancepoint is included in a region, then the region’s concept will be assigned to γ ′.So, the corresponding symbol dimension of the concept will be activated in thesymbol vector of γ ′ (in the concept layer). 2) If there is no region that the in-stance belongs to, then no concept will be assigned to γ ′ within that domain.The symbol dimensions related to the quality dimensions of the domain will beactivated in the symbol vector of γ ′ (in the quality layer). Formally, the symbolvector Vγ′ gets the values in each domain δi ∈ Δ(γ ′) as follows: First a func-tion called Graded Membership function, G(pi

γ′ , ci), is defined as an inclusionoperator to determine the similarity degree of a point pi

γ′ to the region of asub-concept ci ∈ δi within δi. If pi

γ′ is similar enough to the sub-concept’s con-vex region with a certain membership degree, then the value of G(pi

γ′ , ci) is setto VC

γ′(Cy), where Cy ci. Moreover, another function called Graded Qual-ity function, H(pi

γ′ ,qi), is defined to measure to what degree piγ′ belongs to a

quality dimension qi ∈ Q(δi). If piγ′ is not included in any of the sub-concepts,

then the value of H(piγ′ ,qi) is set to VQ

γ′(qi).


(d)Casefour:γ′isnotincludedinanyconcept.

Figure4.4:Anillustrationoffourdifferentcaseswithrespecttothevariouspositionsofaninstancepointsγ′={p

a,p

b}withintwodomainsδaandδb,

togetherwiththeassignedvaluestosymbolvectorVγ′,accordingtotheinclu-sionofthepointsinthepresentedsub-concepts.

Toassignaconcepttoγ′,itisnecessarytochecktheinclusionofthein-stancepointsintheconcept’sregionswithinΔ(γ′)usingasimilaritymeasure.Withinasingledomain,twostatesneedtobeconsidered:1)Iftheinstancepointisincludedinaregion,thentheregion’sconceptwillbeassignedtoγ′.So,thecorrespondingsymboldimensionoftheconceptwillbeactivatedinthesymbolvectorofγ′(intheconceptlayer).2)Ifthereisnoregionthatthein-stancebelongsto,thennoconceptwillbeassignedtoγ′withinthatdomain.Thesymboldimensionsrelatedtothequalitydimensionsofthedomainwillbeactivatedinthesymbolvectorofγ′(inthequalitylayer).Formally,thesymbolvectorVγ′getsthevaluesineachdomainδi∈Δ(γ′)asfollows:Firstafunc-tioncalledGradedMembershipfunction,G(p

iγ′,c

i),isdefinedasaninclusion

operatortodeterminethesimilaritydegreeofapointpiγ′totheregionofa

sub-conceptci∈δiwithinδi.Ifp

iγ′issimilarenoughtothesub-concept’scon-

vexregionwithacertainmembershipdegree,thenthevalueofG(piγ′,c

i)isset

toVCγ′(Cy),whereCy c

i.Moreover,anotherfunctioncalledGradedQual-

ityfunction,H(piγ′,q

i),isdefinedtomeasuretowhatdegreep

iγ′belongstoa

qualitydimensionqi∈Q(δi).Ifp

iγ′isnotincludedinanyofthesub-concepts,

thenthevalueofH(piγ′,q

i)issettoV

Qγ′(q

i).


(d) Case four: γ ′ is not included in any concept.

Figure 4.4: An illustration of four different cases with respect to the variouspositions of an instance points γ ′ = {pa,pb} within two domains δa and δb,together with the assigned values to symbol vector Vγ′ , according to the inclu-sion of the points in the presented sub-concepts.

To assign a concept to γ ′, it is necessary to check the inclusion of the in-stance points in the concept’s regions within Δ(γ ′) using a similarity measure.Within a single domain, two states need to be considered: 1) If the instancepoint is included in a region, then the region’s concept will be assigned to γ ′.So, the corresponding symbol dimension of the concept will be activated in thesymbol vector of γ ′ (in the concept layer). 2) If there is no region that the in-stance belongs to, then no concept will be assigned to γ ′ within that domain.The symbol dimensions related to the quality dimensions of the domain will beactivated in the symbol vector of γ ′ (in the quality layer). Formally, the symbolvector Vγ′ gets the values in each domain δi ∈ Δ(γ ′) as follows: First a func-tion called Graded Membership function, G(pi

γ′ , ci), is defined as an inclusionoperator to determine the similarity degree of a point pi

γ′ to the region of asub-concept ci ∈ δi within δi. If pi

γ′ is similar enough to the sub-concept’s con-vex region with a certain membership degree, then the value of G(pi

γ′ , ci) is setto VC

γ′(Cy), where Cy ci. Moreover, another function called Graded Qual-ity function, H(pi

γ′ ,qi), is defined to measure to what degree piγ′ belongs to a

quality dimension qi ∈ Q(δi). If piγ′ is not included in any of the sub-concepts,

then the value of H(piγ′ ,qi) is set to VQ

γ′(qi).


(d)Casefour:γ′isnotincludedinanyconcept.

Figure4.4:Anillustrationoffourdifferentcaseswithrespecttothevariouspositionsofaninstancepointsγ′={p

a,p

b}withintwodomainsδaandδb,

togetherwiththeassignedvaluestosymbolvectorVγ′,accordingtotheinclu-sionofthepointsinthepresentedsub-concepts.

Toassignaconcepttoγ′,itisnecessarytochecktheinclusionofthein-stancepointsintheconcept’sregionswithinΔ(γ′)usingasimilaritymeasure.Withinasingledomain,twostatesneedtobeconsidered:1)Iftheinstancepointisincludedinaregion,thentheregion’sconceptwillbeassignedtoγ′.So,thecorrespondingsymboldimensionoftheconceptwillbeactivatedinthesymbolvectorofγ′(intheconceptlayer).2)Ifthereisnoregionthatthein-stancebelongsto,thennoconceptwillbeassignedtoγ′withinthatdomain.Thesymboldimensionsrelatedtothequalitydimensionsofthedomainwillbeactivatedinthesymbolvectorofγ′(inthequalitylayer).Formally,thesymbolvectorVγ′getsthevaluesineachdomainδi∈Δ(γ′)asfollows:Firstafunc-tioncalledGradedMembershipfunction,G(p

iγ′,c

i),isdefinedasaninclusion

operatortodeterminethesimilaritydegreeofapointpiγ′totheregionofa

sub-conceptci∈δiwithinδi.Ifp

iγ′issimilarenoughtothesub-concept’scon-

vexregionwithacertainmembershipdegree,thenthevalueofG(piγ′,c

i)isset

toVCγ′(Cy),whereCy c

i.Moreover,anotherfunctioncalledGradedQual-

ityfunction,H(piγ′,q

i),isdefinedtomeasuretowhatdegreep

iγ′belongstoa

qualitydimensionqi∈Q(δi).Ifp

iγ′isnotincludedinanyofthesub-concepts,

thenthevalueofH(piγ′,q

i)issettoV

Qγ′(q

i).


Algorithm 4.1: Inference in Conceptual Space

Function ConceptualSpaceInference(γ ′,Q,Δ,C)foreach δi ∈ Δ(γ ′) do

p ← piγ′ ∈ γ ′

// Set the symbol vector in concept layerforeach c ∈ δi do

simγ′,Cy= max(simγ′,Cy

,G(p, c)) // Cy c

if simγ′,Cy= 0 // cases 1 and 3 (and partially 2)

thenlabelγ′,Cy

= label(Cy)else

labelγ′,Cy= ∅

VCγ′(Cy) = (labelγ′,Cy

, simγ′,Cy)

// Set the symbol vector in quality layerif VC

γ′(Cy) == (∅, 0) // case 4 (and partially 2)then

foreach q ∈ δi dodegreeγ′,q = H(p,q)labelγ′,q = label(Abest

γ′,q )

VQγ′(qi) ← (labelγ′,q,degreeγ′,q)

return Vγ′ : 〈 VCγ′ ,VQ

γ′ 〉

Example 4.4. Consider the instance γ ′ in Figure 4.4b. Point pb is included tothe region of cbyi

within δb. So, VCγ′(Cyi

) in concept layer will get the gradedmembership value between pb and cbyi

. But, point pa is not included in anyof the regions within δa. So, VQ

γ′(qa1 ) and VQ

γ′(qa2 ) in quality layer will get

the graded quality values between pa and two quality dimensions qa1 and qa

2 ,respectively. but pa is not included in any of the regions within δa. So, thesymbol vector Vγ′ will get values at three indices: VC

γ′(Cyi) in concept layer,

and VQγ′(qa

1 ) and VQγ′(qa

2 ) in quality layer.

Algorithm 4.1 shows the steps to set the symbol vector values while iteratingthrough all the involved domains Δ(γ ′). Both graded membership function andgraded quality function are formally defined in the following sections.

Graded Membership function

In Section 3.2, a sub-concept c is defined as a pair c : 〈ηc,φc〉, where ηc isthe convex region representing the geometrical area of the corresponding con-cept in domain δ, and φc is a set of weights showing the assigned degrees of


Algorithm4.1:InferenceinConceptualSpace

FunctionConceptualSpaceInference(γ′,Q,Δ,C)foreachδi∈Δ(γ′)do

p←piγ′∈γ′

//Setthesymbolvectorinconceptlayerforeachc∈δido

simγ′,Cy=max(simγ′,Cy,G(p,c))//Cy c

ifsimγ′,Cy=0//cases1and3(andpartially2)then

labelγ′,Cy=label(Cy)else

labelγ′,Cy=∅VCγ′(Cy)=(labelγ′,Cy,simγ′,Cy)

//SetthesymbolvectorinqualitylayerifV

Cγ′(Cy)==(∅,0)//case4(andpartially2)

thenforeachq∈δido

degreeγ′,q=H(p,q)labelγ′,q=label(A

bestγ′,q)

VQγ′(q

i)←(labelγ′,q,degreeγ′,q)

returnVγ′:〈VCγ′,V

Qγ′〉

Example4.4.Considertheinstanceγ′inFigure4.4b.Pointpb

isincludedtotheregionofc

byiwithinδb.So,V

Cγ′(Cyi)inconceptlayerwillgetthegraded

membershipvaluebetweenpb

andcbyi.But,pointp

aisnotincludedinany

oftheregionswithinδa.So,VQγ′(q

a1)andV

Qγ′(q

a2)inqualitylayerwillget

thegradedqualityvaluesbetweenpa

andtwoqualitydimensionsqa1andq

a2,

respectively.butpa

isnotincludedinanyoftheregionswithinδa.So,thesymbolvectorVγ′willgetvaluesatthreeindices:V

Cγ′(Cyi)inconceptlayer,

andVQγ′(q

a1)andV

Qγ′(q

a2)inqualitylayer.

Algorithm4.1showsthestepstosetthesymbolvectorvalueswhileiteratingthroughalltheinvolveddomainsΔ(γ′).Bothgradedmembershipfunctionandgradedqualityfunctionareformallydefinedinthefollowingsections.

GradedMembershipfunction

InSection3.2,asub-conceptcisdefinedasapairc:〈ηc,φc〉,whereηcistheconvexregionrepresentingthegeometricalareaofthecorrespondingcon-ceptindomainδ,andφcisasetofweightsshowingtheassigneddegreesof


Algorithm 4.1: Inference in Conceptual Space

Function ConceptualSpaceInference(γ ′,Q,Δ,C)foreach δi ∈ Δ(γ ′) do

p ← piγ′ ∈ γ ′

// Set the symbol vector in concept layerforeach c ∈ δi do

simγ′,Cy= max(simγ′,Cy

,G(p, c)) // Cy c

if simγ′,Cy= 0 // cases 1 and 3 (and partially 2)

thenlabelγ′,Cy

= label(Cy)else

labelγ′,Cy= ∅

VCγ′(Cy) = (labelγ′,Cy

, simγ′,Cy)

// Set the symbol vector in quality layerif VC

γ′(Cy) == (∅, 0) // case 4 (and partially 2)then

foreach q ∈ δi dodegreeγ′,q = H(p,q)labelγ′,q = label(Abest

γ′,q )

VQγ′(qi) ← (labelγ′,q,degreeγ′,q)

return Vγ′ : 〈 VCγ′ ,VQ

γ′ 〉

Example 4.4. Consider the instance γ ′ in Figure 4.4b. Point pb is included tothe region of cbyi

within δb. So, VCγ′(Cyi

) in concept layer will get the gradedmembership value between pb and cbyi

. But, point pa is not included in anyof the regions within δa. So, VQ

γ′(qa1 ) and VQ

γ′(qa2 ) in quality layer will get

the graded quality values between pa and two quality dimensions qa1 and qa

2 ,respectively. but pa is not included in any of the regions within δa. So, thesymbol vector Vγ′ will get values at three indices: VC

γ′(Cyi) in concept layer,

and VQγ′(qa

1 ) and VQγ′(qa

2 ) in quality layer.

Algorithm 4.1 shows the steps to set the symbol vector values while iteratingthrough all the involved domains Δ(γ ′). Both graded membership function andgraded quality function are formally defined in the following sections.

Graded Membership function

In Section 3.2, a sub-concept c is defined as a pair c : 〈ηc,φc〉, where ηc isthe convex region representing the geometrical area of the corresponding con-cept in domain δ, and φc is a set of weights showing the assigned degrees of


Algorithm4.1:InferenceinConceptualSpace

FunctionConceptualSpaceInference(γ′,Q,Δ,C)foreachδi∈Δ(γ′)do

p←piγ′∈γ′

//Setthesymbolvectorinconceptlayerforeachc∈δido

simγ′,Cy=max(simγ′,Cy,G(p,c))//Cy c

ifsimγ′,Cy=0//cases1and3(andpartially2)then

labelγ′,Cy=label(Cy)else

labelγ′,Cy=∅VCγ′(Cy)=(labelγ′,Cy,simγ′,Cy)

//SetthesymbolvectorinqualitylayerifV

Cγ′(Cy)==(∅,0)//case4(andpartially2)

thenforeachq∈δido

degreeγ′,q=H(p,q)labelγ′,q=label(A

bestγ′,q)

VQγ′(q

i)←(labelγ′,q,degreeγ′,q)

returnVγ′:〈VCγ′,V

Qγ′〉

Example4.4.Considertheinstanceγ′inFigure4.4b.Pointpb

isincludedtotheregionofc

byiwithinδb.So,V

Cγ′(Cyi)inconceptlayerwillgetthegraded

membershipvaluebetweenpb

andcbyi.But,pointp

aisnotincludedinany

oftheregionswithinδa.So,VQγ′(q

a1)andV

Qγ′(q

a2)inqualitylayerwillget

thegradedqualityvaluesbetweenpa

andtwoqualitydimensionsqa1andq

a2,

respectively.butpa

isnotincludedinanyoftheregionswithinδa.So,thesymbolvectorVγ′willgetvaluesatthreeindices:V

Cγ′(Cyi)inconceptlayer,

andVQγ′(q

a1)andV

Qγ′(q

a2)inqualitylayer.

Algorithm4.1showsthestepstosetthesymbolvectorvalueswhileiteratingthroughalltheinvolveddomainsΔ(γ′).Bothgradedmembershipfunctionandgradedqualityfunctionareformallydefinedinthefollowingsections.

GradedMembershipfunction

InSection3.2,asub-conceptcisdefinedasapairc:〈ηc,φc〉,whereηcistheconvexregionrepresentingthegeometricalareaofthecorrespondingcon-ceptindomainδ,andφcisasetofweightsshowingtheassigneddegreesof


salience between sub-concept and each quality dimension within the domain.The problem of inclusion has been studied in the literature of the conceptualspaces theory with various definitions such as inclusion operator [9], gradedsimilarity [92], and graded membership [69, 112]. These definitions calculatethe similarity of the instance points to the regions based on their geometricaldistances, with or without considering the gradedness of membership. Here,since the convex regions of concepts are constructed based on the observedinstances, it does not make sense to adhere to the crisp boundaries of the cal-culated regions rigidly. So, a point which is not certainly inside the calculatedboundaries a region, but is similar enough to the region, can be counted as amember of the region’s concept.

The Graded Membership function G(p, c) is defined as an inclusion operatorbetween a given point p and a defined sub-concept c : 〈ηc,φc〉. This functionshows how similar is p to the convex region ηc of c with the certain set ofweights φc within a metric domain δ. The graded membership is calculated byapplying geometrical algorithms that consider whether an n-dimensional pointis included in the n-dimensional convex hull or not. If point p is certainly in-cluded in the region ηc, then G(p, c) is equal to one. But if p is outside theregion, then the similarity of point p to the region is defined as a monotonicdecreasing function [69] which is measured using a fuzzy membership functionof distance sim(p, c) = f[d(p, c)]. This similarity takes values between [0, 1]and expresses the graded degree of inclusion of p in c. From the experimentson similarity cognition, the similarity can be measured as an exponential decayfunction of the distance: sim(d) = e−rd [202] (where r is a constant factor).Using a fuzzy membership function to measure the similarity has the advantageof using the notions from the fuzzy set theory that provide linguistic descrip-tions for the output fuzzy degrees [188].

Several methods have been proposed to compute the distance of a point p toa convex region ηc. The Hausdorff distance dH(p,ηc) [14] is a proper choice,which relies on the definition of a distance measure between two n-dimensionalpoints (namely Weighted Minkowski Metric). As a function of similarity mea-sure, a graded membership function is defined inspired by Hampton’s defini-tion [112] in which a determinate boundary region of membership is assumed.For the points inside the region, the membership value is one. For the pointsout of the region, with a given lower-bound threshold θL, if d(p,ηc) � θL, thenp is similar enough to be counted as a member of c, and if d(p,ηc) > θL, thenp is far to be counted as an instance of c [188]2.

2The definition of graded membership function in [69,112] is slightly different, where it is basedon three thresholds to define the lower, upper, and the middle level of the boundary regions.


saliencebetweensub-conceptandeachqualitydimensionwithinthedomain.Theproblemofinclusionhasbeenstudiedintheliteratureoftheconceptualspacestheorywithvariousdefinitionssuchasinclusionoperator[9],gradedsimilarity[92],andgradedmembership[69,112].Thesedefinitionscalculatethesimilarityoftheinstancepointstotheregionsbasedontheirgeometricaldistances,withorwithoutconsideringthegradednessofmembership.Here,sincetheconvexregionsofconceptsareconstructedbasedontheobservedinstances,itdoesnotmakesensetoadheretothecrispboundariesofthecal-culatedregionsrigidly.So,apointwhichisnotcertainlyinsidethecalculatedboundariesaregion,butissimilarenoughtotheregion,canbecountedasamemberoftheregion’sconcept.

TheGradedMembershipfunctionG(p,c)isdefinedasaninclusionoperatorbetweenagivenpointpandadefinedsub-conceptc:〈ηc,φc〉.Thisfunctionshowshowsimilarisptotheconvexregionηcofcwiththecertainsetofweightsφcwithinametricdomainδ.Thegradedmembershipiscalculatedbyapplyinggeometricalalgorithmsthatconsiderwhetherann-dimensionalpointisincludedinthen-dimensionalconvexhullornot.Ifpointpiscertainlyin-cludedintheregionηc,thenG(p,c)isequaltoone.Butifpisoutsidetheregion,thenthesimilarityofpointptotheregionisdefinedasamonotonicdecreasingfunction[69]whichismeasuredusingafuzzymembershipfunctionofdistancesim(p,c)=f[d(p,c)].Thissimilaritytakesvaluesbetween[0,1]andexpressesthegradeddegreeofinclusionofpinc.Fromtheexperimentsonsimilaritycognition,thesimilaritycanbemeasuredasanexponentialdecayfunctionofthedistance:sim(d)=e

−rd[202](whererisaconstantfactor).

Usingafuzzymembershipfunctiontomeasurethesimilarityhastheadvantageofusingthenotionsfromthefuzzysettheorythatprovidelinguisticdescrip-tionsfortheoutputfuzzydegrees[188].

Severalmethodshavebeenproposedtocomputethedistanceofapointptoaconvexregionηc.TheHausdorffdistanced

H(p,ηc)[14]isaproperchoice,

whichreliesonthedefinitionofadistancemeasurebetweentwon-dimensionalpoints(namelyWeightedMinkowskiMetric).Asafunctionofsimilaritymea-sure,agradedmembershipfunctionisdefinedinspiredbyHampton’sdefini-tion[112]inwhichadeterminateboundaryregionofmembershipisassumed.Forthepointsinsidetheregion,themembershipvalueisone.Forthepointsoutoftheregion,withagivenlower-boundthresholdθL,ifd(p,ηc)�θL,thenpissimilarenoughtobecountedasamemberofc,andifd(p,ηc)>θL,thenpisfartobecountedasaninstanceofc[188]2.

2Thedefinitionofgradedmembershipfunctionin[69,112]isslightlydifferent,whereitisbasedonthreethresholdstodefinethelower,upper,andthemiddleleveloftheboundaryregions.


salience between sub-concept and each quality dimension within the domain.The problem of inclusion has been studied in the literature of the conceptualspaces theory with various definitions such as inclusion operator [9], gradedsimilarity [92], and graded membership [69, 112]. These definitions calculatethe similarity of the instance points to the regions based on their geometricaldistances, with or without considering the gradedness of membership. Here,since the convex regions of concepts are constructed based on the observedinstances, it does not make sense to adhere to the crisp boundaries of the cal-culated regions rigidly. So, a point which is not certainly inside the calculatedboundaries a region, but is similar enough to the region, can be counted as amember of the region’s concept.

The Graded Membership function G(p, c) is defined as an inclusion operatorbetween a given point p and a defined sub-concept c : 〈ηc,φc〉. This functionshows how similar is p to the convex region ηc of c with the certain set ofweights φc within a metric domain δ. The graded membership is calculated byapplying geometrical algorithms that consider whether an n-dimensional pointis included in the n-dimensional convex hull or not. If point p is certainly in-cluded in the region ηc, then G(p, c) is equal to one. But if p is outside theregion, then the similarity of point p to the region is defined as a monotonicdecreasing function [69] which is measured using a fuzzy membership functionof distance sim(p, c) = f[d(p, c)]. This similarity takes values between [0, 1]and expresses the graded degree of inclusion of p in c. From the experimentson similarity cognition, the similarity can be measured as an exponential decayfunction of the distance: sim(d) = e−rd [202] (where r is a constant factor).Using a fuzzy membership function to measure the similarity has the advantageof using the notions from the fuzzy set theory that provide linguistic descrip-tions for the output fuzzy degrees [188].

Several methods have been proposed to compute the distance of a point p toa convex region ηc. The Hausdorff distance dH(p,ηc) [14] is a proper choice,which relies on the definition of a distance measure between two n-dimensionalpoints (namely Weighted Minkowski Metric). As a function of similarity mea-sure, a graded membership function is defined inspired by Hampton’s defini-tion [112] in which a determinate boundary region of membership is assumed.For the points inside the region, the membership value is one. For the pointsout of the region, with a given lower-bound threshold θL, if d(p,ηc) � θL, thenp is similar enough to be counted as a member of c, and if d(p,ηc) > θL, thenp is far to be counted as an instance of c [188]2.

2The definition of graded membership function in [69,112] is slightly different, where it is basedon three thresholds to define the lower, upper, and the middle level of the boundary regions.


saliencebetweensub-conceptandeachqualitydimensionwithinthedomain.Theproblemofinclusionhasbeenstudiedintheliteratureoftheconceptualspacestheorywithvariousdefinitionssuchasinclusionoperator[9],gradedsimilarity[92],andgradedmembership[69,112].Thesedefinitionscalculatethesimilarityoftheinstancepointstotheregionsbasedontheirgeometricaldistances,withorwithoutconsideringthegradednessofmembership.Here,sincetheconvexregionsofconceptsareconstructedbasedontheobservedinstances,itdoesnotmakesensetoadheretothecrispboundariesofthecal-culatedregionsrigidly.So,apointwhichisnotcertainlyinsidethecalculatedboundariesaregion,butissimilarenoughtotheregion,canbecountedasamemberoftheregion’sconcept.

TheGradedMembershipfunctionG(p,c)isdefinedasaninclusionoperatorbetweenagivenpointpandadefinedsub-conceptc:〈ηc,φc〉.Thisfunctionshowshowsimilarisptotheconvexregionηcofcwiththecertainsetofweightsφcwithinametricdomainδ.Thegradedmembershipiscalculatedbyapplyinggeometricalalgorithmsthatconsiderwhetherann-dimensionalpointisincludedinthen-dimensionalconvexhullornot.Ifpointpiscertainlyin-cludedintheregionηc,thenG(p,c)isequaltoone.Butifpisoutsidetheregion,thenthesimilarityofpointptotheregionisdefinedasamonotonicdecreasingfunction[69]whichismeasuredusingafuzzymembershipfunctionofdistancesim(p,c)=f[d(p,c)].Thissimilaritytakesvaluesbetween[0,1]andexpressesthegradeddegreeofinclusionofpinc.Fromtheexperimentsonsimilaritycognition,thesimilaritycanbemeasuredasanexponentialdecayfunctionofthedistance:sim(d)=e

−rd[202](whererisaconstantfactor).

Usingafuzzymembershipfunctiontomeasurethesimilarityhastheadvantageofusingthenotionsfromthefuzzysettheorythatprovidelinguisticdescrip-tionsfortheoutputfuzzydegrees[188].

Severalmethodshavebeenproposedtocomputethedistanceofapointptoaconvexregionηc.TheHausdorffdistanced

H(p,ηc)[14]isaproperchoice,

whichreliesonthedefinitionofadistancemeasurebetweentwon-dimensionalpoints(namelyWeightedMinkowskiMetric).Asafunctionofsimilaritymea-sure,agradedmembershipfunctionisdefinedinspiredbyHampton’sdefini-tion[112]inwhichadeterminateboundaryregionofmembershipisassumed.Forthepointsinsidetheregion,themembershipvalueisone.Forthepointsoutoftheregion,withagivenlower-boundthresholdθL,ifd(p,ηc)�θL,thenpissimilarenoughtobecountedasamemberofc,andifd(p,ηc)>θL,thenpisfartobecountedasaninstanceofc[188]2.

2Thedefinitionofgradedmembershipfunctionin[69,112]isslightlydifferent,whereitisbasedonthreethresholdstodefinethelower,upper,andthemiddleleveloftheboundaryregions.


Definition 4.2. The graded membership function G : δ → [0, 1] is the similaritymeasure between a point p and a sub-concept c ∈ Cy as:

G(p, c) =

⎧⎪⎨⎪⎩

1 if p ∈ ηc

e−rdH(p,ηc) if p /∈ ηc & dH(p,ηc) � θL

0 if p /∈ ηc & dH(p,ηc) > θL

(4.1)

The symbol vector of a new instance in the concept layer (VCγ′ ) is set by

the graded membership function by measuring the similarity of an instancepoint to each region of the sub-concept within the domains (Algorithm 4.1).As mentioned in Section 4.1, each vi ∈ Vγ consists of a pair of valuesvi = (label, value). The similarity values greater than zero will lead to as-sign a non-empty label to the corresponding concept’s index in symbol vector.Formally, for a given γ ′ and Cy, two elements (label, value) are calculated as:VCγ′(Cy) = (labelγ′,Cy

, simγ′,Cy), where

simγ′,Cy= maxpi∈γ′,cj∈Cy

(G(pi, cj)), (4.2)

and

labelγ′,Cy=

{label(Cy) if simγ,Cy

> 0∅ o.w

(4.3)

Example 4.5. Figures 4.4a, 4.4b, and 4.4c show the example values of thegraded membership function (G) calculated for the points pa and pb based ontheir positions and distances to the convex regions of the sub-concepts in thespace. For example in Figure 4.4c, suppose in δa, G(pa

γ′ , cayj) = 0.9. Then, the

elements of VCγ′(Cyj

) are set to (label(Cyj), 0.9).

Graded Quality Function

According to Algorithm 4.1, if a point p in a domain δ is not similar enoughto any sub-concept within δ, then the values of the symbol vector in the qual-ity layer will be set based on the graded value of p for each quality dimensionof δ. Recall from Chapter 3, a quality dimension q : 〈Hq, Iq,μq〉 contains afamily of membership functions μq, representing the linguistic terms relatedto graded values of q. In particular, this function is defined as a fuzzy gran-ulation to exploit the linguistic characterisation of feature values, which areidentified by prior knowledge. To formalise μq, fuzzy membership functionsare defined for a set of pre-defined label classes which forms a fuzzy partitionof the interval Iq [165]. Suppose Ai is a class (label) acquired for q (e.g., lin-guistic label tall for feature height). The corresponding fuzzy set is defined as:Ai = {(x,μAi

) | x ∈ Iq}, where μAiis a sigmoidal membership function with


Definition4.2.ThegradedmembershipfunctionG:δ→[0,1]isthesimilaritymeasurebetweenapointpandasub-conceptc∈Cyas:

G(p,c)=

⎧⎪⎨⎪⎩

1ifp∈ηc

e−rd

H(p,ηc)

ifp/∈ηc&dH(p,ηc)�θL

0ifp/∈ηc&dH(p,ηc)>θL

(4.1)

Thesymbolvectorofanewinstanceintheconceptlayer(VCγ′)issetby

thegradedmembershipfunctionbymeasuringthesimilarityofaninstancepointtoeachregionofthesub-conceptwithinthedomains(Algorithm4.1).AsmentionedinSection4.1,eachvi∈Vγconsistsofapairofvaluesvi=(label,value).Thesimilarityvaluesgreaterthanzerowillleadtoas-signanon-emptylabeltothecorrespondingconcept’sindexinsymbolvector.Formally,foragivenγ′andCy,twoelements(label,value)arecalculatedas:VCγ′(Cy)=(labelγ′,Cy,simγ′,Cy),where

simγ′,Cy=maxpi∈γ′,cj∈Cy(G(pi,cj)),(4.2)

and

labelγ′,Cy=

{label(Cy)ifsimγ,Cy>0∅o.w

(4.3)

Example4.5.Figures4.4a,4.4b,and4.4cshowtheexamplevaluesofthegradedmembershipfunction(G)calculatedforthepointsp

aandp

bbasedon

theirpositionsanddistancestotheconvexregionsofthesub-conceptsinthespace.ForexampleinFigure4.4c,supposeinδa,G(p

aγ′,c

ayj)=0.9.Then,the

elementsofVCγ′(Cyj)aresetto(label(Cyj),0.9).

GradedQualityFunction

AccordingtoAlgorithm4.1,ifapointpinadomainδisnotsimilarenoughtoanysub-conceptwithinδ,thenthevaluesofthesymbolvectorinthequal-itylayerwillbesetbasedonthegradedvalueofpforeachqualitydimensionofδ.RecallfromChapter3,aqualitydimensionq:〈Hq,Iq,μq〉containsafamilyofmembershipfunctionsμq,representingthelinguistictermsrelatedtogradedvaluesofq.Inparticular,thisfunctionisdefinedasafuzzygran-ulationtoexploitthelinguisticcharacterisationoffeaturevalues,whichareidentifiedbypriorknowledge.Toformaliseμq,fuzzymembershipfunctionsaredefinedforasetofpre-definedlabelclasseswhichformsafuzzypartitionoftheintervalIq[165].SupposeAiisaclass(label)acquiredforq(e.g.,lin-guisticlabeltallforfeatureheight).Thecorrespondingfuzzysetisdefinedas:Ai={(x,μAi)|x∈Iq},whereμAiisasigmoidalmembershipfunctionwith


Definition 4.2. The graded membership function G : δ → [0, 1] is the similaritymeasure between a point p and a sub-concept c ∈ Cy as:

G(p, c) =

⎧⎪⎨⎪⎩

1 if p ∈ ηc

e−rdH(p,ηc) if p /∈ ηc & dH(p,ηc) � θL

0 if p /∈ ηc & dH(p,ηc) > θL

(4.1)

The symbol vector of a new instance in the concept layer (VCγ′ ) is set by

the graded membership function by measuring the similarity of an instancepoint to each region of the sub-concept within the domains (Algorithm 4.1).As mentioned in Section 4.1, each vi ∈ Vγ consists of a pair of valuesvi = (label, value). The similarity values greater than zero will lead to as-sign a non-empty label to the corresponding concept’s index in symbol vector.Formally, for a given γ ′ and Cy, two elements (label, value) are calculated as:VCγ′(Cy) = (labelγ′,Cy

, simγ′,Cy), where

simγ′,Cy= maxpi∈γ′,cj∈Cy

(G(pi, cj)), (4.2)

and

labelγ′,Cy=

{label(Cy) if simγ,Cy

> 0∅ o.w

(4.3)

Example 4.5. Figures 4.4a, 4.4b, and 4.4c show the example values of thegraded membership function (G) calculated for the points pa and pb based ontheir positions and distances to the convex regions of the sub-concepts in thespace. For example in Figure 4.4c, suppose in δa, G(pa

γ′ , cayj) = 0.9. Then, the

elements of VCγ′(Cyj

) are set to (label(Cyj), 0.9).

Graded Quality Function

According to Algorithm 4.1, if a point p in a domain δ is not similar enoughto any sub-concept within δ, then the values of the symbol vector in the qual-ity layer will be set based on the graded value of p for each quality dimensionof δ. Recall from Chapter 3, a quality dimension q : 〈Hq, Iq,μq〉 contains afamily of membership functions μq, representing the linguistic terms relatedto graded values of q. In particular, this function is defined as a fuzzy gran-ulation to exploit the linguistic characterisation of feature values, which areidentified by prior knowledge. To formalise μq, fuzzy membership functionsare defined for a set of pre-defined label classes which forms a fuzzy partitionof the interval Iq [165]. Suppose Ai is a class (label) acquired for q (e.g., lin-guistic label tall for feature height). The corresponding fuzzy set is defined as:Ai = {(x,μAi

) | x ∈ Iq}, where μAiis a sigmoidal membership function with


Definition4.2.ThegradedmembershipfunctionG:δ→[0,1]isthesimilaritymeasurebetweenapointpandasub-conceptc∈Cyas:

G(p,c)=

⎧⎪⎨⎪⎩

1ifp∈ηc

e−rd

H(p,ηc)

ifp/∈ηc&dH(p,ηc)�θL

0ifp/∈ηc&dH(p,ηc)>θL

(4.1)

Thesymbolvectorofanewinstanceintheconceptlayer(VCγ′)issetby

thegradedmembershipfunctionbymeasuringthesimilarityofaninstancepointtoeachregionofthesub-conceptwithinthedomains(Algorithm4.1).AsmentionedinSection4.1,eachvi∈Vγconsistsofapairofvaluesvi=(label,value).Thesimilarityvaluesgreaterthanzerowillleadtoas-signanon-emptylabeltothecorrespondingconcept’sindexinsymbolvector.Formally,foragivenγ′andCy,twoelements(label,value)arecalculatedas:VCγ′(Cy)=(labelγ′,Cy,simγ′,Cy),where

simγ′,Cy=maxpi∈γ′,cj∈Cy(G(pi,cj)),(4.2)

and

labelγ′,Cy=

{label(Cy)ifsimγ,Cy>0∅o.w

(4.3)

Example4.5.Figures4.4a,4.4b,and4.4cshowtheexamplevaluesofthegradedmembershipfunction(G)calculatedforthepointsp

aandp

bbasedon

theirpositionsanddistancestotheconvexregionsofthesub-conceptsinthespace.ForexampleinFigure4.4c,supposeinδa,G(p

aγ′,c

ayj)=0.9.Then,the

elementsofVCγ′(Cyj)aresetto(label(Cyj),0.9).

GradedQualityFunction

AccordingtoAlgorithm4.1,ifapointpinadomainδisnotsimilarenoughtoanysub-conceptwithinδ,thenthevaluesofthesymbolvectorinthequal-itylayerwillbesetbasedonthegradedvalueofpforeachqualitydimensionofδ.RecallfromChapter3,aqualitydimensionq:〈Hq,Iq,μq〉containsafamilyofmembershipfunctionsμq,representingthelinguistictermsrelatedtogradedvaluesofq.Inparticular,thisfunctionisdefinedasafuzzygran-ulationtoexploitthelinguisticcharacterisationoffeaturevalues,whichareidentifiedbypriorknowledge.Toformaliseμq,fuzzymembershipfunctionsaredefinedforasetofpre-definedlabelclasseswhichformsafuzzypartitionoftheintervalIq[165].SupposeAiisaclass(label)acquiredforq(e.g.,lin-guisticlabeltallforfeatureheight).Thecorrespondingfuzzysetisdefinedas:Ai={(x,μAi)|x∈Iq},whereμAiisasigmoidalmembershipfunctionwith


certain parameters to define the lower and upper boundaries of the function3.Then, μq = {μA1 , . . . ,μAn

}.

Example 4.6. Consider the elongation, described in Example 3.2. A set of labelclasses to describe the elongation is { Acir =‘circular’, Aelp =‘elliptical’, andAeld =‘elongated’}. Then the family of membership functions for qel is μqel ={μAcir ,μAelp ,μAeld }. Figure 4.5 depicts the defined membership functions for theelongation quality dimension.

This linguistic mapping provides a symbolic representation for numeric in-terval values of the quality dimensions. The graded quality value of an instanceγ ′ for a quality dimension q is calculated based on the quality dimension valueof the instance point as pq = pγ′(q), where pq ∈ Iq. Using the defined fuzzymembership functions, pq is mapped into the fuzzy set best matching to thegiven value. Using the functions in μ, the values of the symbol vector can beset in the quality layer. Recall p = 〈pq1 , . . . ,pq|Q(δ)|

〉 as the vector of qualitydimension values for the point p in δ.

Definition 4.3. Graded quality function H : Iq → [0, 1] is the degree of mem-bership, wherein for a quality dimension q, it returns the maximum degree ofmembership of pq using the membership functions in μq, as:

H(p,q) = maxμAi∈μq

μAi(pq) (4.4)

The symbol vector of a new instance in the quality layer (VQγ′ ) is filled with

the values of the graded quality function (Algorithm 4.1). Similar to the con-cept layer, each vi ∈ Vγ consists of a pair of values vi = (label, value).The value of the graded quality function, which is the maximum degree ofmembership of vi assigns the best match fuzzy subset (i.e., symbolic label)to the corresponding quality dimension’s index in symbol vector. Formally,for a given γ ′ and Cy, two elements (label, value) of are calculated as:VQγ′(q) = (labelγ′,q,degreeγ′,q), where

degreeγ′,q = H(p,q), (4.5)

and

labelγ′,q = label(Abestγ′,q ). (4.6)

Here, Abestγ′,q is the fuzzy subset with the maximum degree of membership

such that μAbestγ′ ,q

∈ μq and ∀(μAi∈ μq) : μAbest

γ′ ,q(pq) � μAi

(pq).

Example 4.7. Considering the functions μqel in Example 4.6 (Figure 4.5), as-sume that for a given instance point pγ′ , its elongation value is pq = 0.75. So,

3Sigmoidal membership function is defined as: f(x,a,c) = 11+e−a(x−c)


certainparameterstodefinethelowerandupperboundariesofthefunction3.Then,μq={μA1,...,μAn}.

Example4.6.Considertheelongation,describedinExample3.2.Asetoflabelclassestodescribetheelongationis{Acir=‘circular’,Aelp=‘elliptical’,andAeld=‘elongated’}.Thenthefamilyofmembershipfunctionsforqelisμqel={μAcir,μAelp,μAeld}.Figure4.5depictsthedefinedmembershipfunctionsfortheelongationqualitydimension.

Thislinguisticmappingprovidesasymbolicrepresentationfornumericin-tervalvaluesofthequalitydimensions.Thegradedqualityvalueofaninstanceγ′foraqualitydimensionqiscalculatedbasedonthequalitydimensionvalueoftheinstancepointaspq=pγ′(q),wherepq∈Iq.Usingthedefinedfuzzymembershipfunctions,pqismappedintothefuzzysetbestmatchingtothegivenvalue.Usingthefunctionsinμ,thevaluesofthesymbolvectorcanbesetinthequalitylayer.Recallp=〈pq1,...,pq|Q(δ)|〉asthevectorofqualitydimensionvaluesforthepointpinδ.

Definition4.3.GradedqualityfunctionH:Iq→[0,1]isthedegreeofmem-bership,whereinforaqualitydimensionq,itreturnsthemaximumdegreeofmembershipofpqusingthemembershipfunctionsinμq,as:

H(p,q)=maxμAi∈μqμAi(pq)(4.4)

Thesymbolvectorofanewinstanceinthequalitylayer(VQγ′)isfilledwith

thevaluesofthegradedqualityfunction(Algorithm4.1).Similartothecon-ceptlayer,eachvi∈Vγconsistsofapairofvaluesvi=(label,value).Thevalueofthegradedqualityfunction,whichisthemaximumdegreeofmembershipofviassignsthebestmatchfuzzysubset(i.e.,symboliclabel)tothecorrespondingqualitydimension’sindexinsymbolvector.Formally,foragivenγ′andCy,twoelements(label,value)ofarecalculatedas:VQγ′(q)=(labelγ′,q,degreeγ′,q),where

degreeγ′,q=H(p,q),(4.5)

and

labelγ′,q=label(Abestγ′,q).(4.6)

Here,Abestγ′,qisthefuzzysubsetwiththemaximumdegreeofmembership

suchthatμAbestγ′,q∈μqand∀(μAi∈μq):μA

bestγ′,q

(pq)�μAi(pq).

Example4.7.ConsideringthefunctionsμqelinExample4.6(Figure4.5),as-sumethatforagiveninstancepointpγ′,itselongationvalueispq=0.75.So,

3Sigmoidalmembershipfunctionisdefinedas:f(x,a,c)=11+e−a(x−c)


certain parameters to define the lower and upper boundaries of the function3.Then, μq = {μA1 , . . . ,μAn

}.

Example 4.6. Consider the elongation, described in Example 3.2. A set of labelclasses to describe the elongation is { Acir =‘circular’, Aelp =‘elliptical’, andAeld =‘elongated’}. Then the family of membership functions for qel is μqel ={μAcir ,μAelp ,μAeld }. Figure 4.5 depicts the defined membership functions for theelongation quality dimension.

This linguistic mapping provides a symbolic representation for numeric in-terval values of the quality dimensions. The graded quality value of an instanceγ ′ for a quality dimension q is calculated based on the quality dimension valueof the instance point as pq = pγ′(q), where pq ∈ Iq. Using the defined fuzzymembership functions, pq is mapped into the fuzzy set best matching to thegiven value. Using the functions in μ, the values of the symbol vector can beset in the quality layer. Recall p = 〈pq1 , . . . ,pq|Q(δ)|

〉 as the vector of qualitydimension values for the point p in δ.

Definition 4.3. Graded quality function H : Iq → [0, 1] is the degree of mem-bership, wherein for a quality dimension q, it returns the maximum degree ofmembership of pq using the membership functions in μq, as:

H(p,q) = maxμAi∈μq

μAi(pq) (4.4)

The symbol vector of a new instance in the quality layer (VQγ′ ) is filled with

the values of the graded quality function (Algorithm 4.1). Similar to the con-cept layer, each vi ∈ Vγ consists of a pair of values vi = (label, value).The value of the graded quality function, which is the maximum degree ofmembership of vi assigns the best match fuzzy subset (i.e., symbolic label)to the corresponding quality dimension’s index in symbol vector. Formally,for a given γ ′ and Cy, two elements (label, value) of are calculated as:VQγ′(q) = (labelγ′,q,degreeγ′,q), where

degreeγ′,q = H(p,q), (4.5)

and

labelγ′,q = label(Abestγ′,q ). (4.6)

Here, Abestγ′,q is the fuzzy subset with the maximum degree of membership

such that μAbestγ′ ,q

∈ μq and ∀(μAi∈ μq) : μAbest

γ′ ,q(pq) � μAi

(pq).

Example 4.7. Considering the functions μqel in Example 4.6 (Figure 4.5), as-sume that for a given instance point pγ′ , its elongation value is pq = 0.75. So,

3Sigmoidal membership function is defined as: f(x,a,c) = 11+e−a(x−c)


certainparameterstodefinethelowerandupperboundariesofthefunction3.Then,μq={μA1,...,μAn}.

Example4.6.Considertheelongation,describedinExample3.2.Asetoflabelclassestodescribetheelongationis{Acir=‘circular’,Aelp=‘elliptical’,andAeld=‘elongated’}.Thenthefamilyofmembershipfunctionsforqelisμqel={μAcir,μAelp,μAeld}.Figure4.5depictsthedefinedmembershipfunctionsfortheelongationqualitydimension.

Thislinguisticmappingprovidesasymbolicrepresentationfornumericin-tervalvaluesofthequalitydimensions.Thegradedqualityvalueofaninstanceγ′foraqualitydimensionqiscalculatedbasedonthequalitydimensionvalueoftheinstancepointaspq=pγ′(q),wherepq∈Iq.Usingthedefinedfuzzymembershipfunctions,pqismappedintothefuzzysetbestmatchingtothegivenvalue.Usingthefunctionsinμ,thevaluesofthesymbolvectorcanbesetinthequalitylayer.Recallp=〈pq1,...,pq|Q(δ)|〉asthevectorofqualitydimensionvaluesforthepointpinδ.

Definition4.3.GradedqualityfunctionH:Iq→[0,1]isthedegreeofmem-bership,whereinforaqualitydimensionq,itreturnsthemaximumdegreeofmembershipofpqusingthemembershipfunctionsinμq,as:

H(p,q)=maxμAi∈μqμAi(pq)(4.4)

Thesymbolvectorofanewinstanceinthequalitylayer(VQγ′)isfilledwith

thevaluesofthegradedqualityfunction(Algorithm4.1).Similartothecon-ceptlayer,eachvi∈Vγconsistsofapairofvaluesvi=(label,value).Thevalueofthegradedqualityfunction,whichisthemaximumdegreeofmembershipofviassignsthebestmatchfuzzysubset(i.e.,symboliclabel)tothecorrespondingqualitydimension’sindexinsymbolvector.Formally,foragivenγ′andCy,twoelements(label,value)ofarecalculatedas:VQγ′(q)=(labelγ′,q,degreeγ′,q),where

degreeγ′,q=H(p,q),(4.5)

and

labelγ′,q=label(Abestγ′,q).(4.6)

Here,Abestγ′,qisthefuzzysubsetwiththemaximumdegreeofmembership

suchthatμAbestγ′,q∈μqand∀(μAi∈μq):μA

bestγ′,q

(pq)�μAi(pq).

Example4.7.ConsideringthefunctionsμqelinExample4.6(Figure4.5),as-sumethatforagiveninstancepointpγ′,itselongationvalueispq=0.75.So,

3Sigmoidalmembershipfunctionisdefinedas:f(x,a,c)=11+e−a(x−c)


Figure 4.5: Example of the membership functions of the linguistic terms circu-lar, elliptical, and elongated, describing the elongation quality dimension.

μAcir(pq) = 0, μAelp(vq) = 0.15, and μAeld(pq) = 0.9. Then, H(pγ′ ,qel) =

0.9, and Abestγ′,qel

= Aeld. Thus, one can say that “the given instance is elongatedto a degree 0.9”.

Example 4.8. Figures 4.4b and 4.4d show the example values of graded qualityfunction (H) calculated for the points pa and pb based on their values relatedto the quality dimensions of two domains. For example in Figure 4.4b, supposepa is not included in any region in δa. Then, two vector indices of the symbolvector in the quality layer get values as: VQ

γ′(qa1 ) = (label(Abest

γ′,qa1), 0.7) and

VQγ′(qa

2 ) = (label(Abestγ′,qa

2), 0.6).

4.2.2 Phase B: Inference in Symbol Space (Lexicalisation and

Realisation)

In this phase, the aim is to infer a linguistic lexicon from the symbol vectorof an unknown instance, to generate descriptions in natural language form.From the NLG perspective, this is the task of lexicalisation, that decides whichlinguistic terms (i.e., natural words) should be selected from the determinedcontent [182]. This can be done by verbalising the linguistic labels that are cal-culated and stored in the symbol vector. This verbalisation is done either withannotating a new instance via the concept labels, or with characterising theinstance via the quality dimension labels. The tasks of annotation and char-acterisation will assign a set of lexical items to an unknown observation. Thiscollection of linguistic terms is then turned to the natural language phrases (i.e.,sentences) using the realisation tools in NLG systems. Algorithm 4.2 shows thesteps of the tasks in phase B.

Annotation in the Concept Layer

Annotation for an instance γ ′ is to annotate a set of linguistic labels whichare derived from the associated concepts in the concept layer of symbol vec-


Figure4.5:Exampleofthemembershipfunctionsofthelinguistictermscircu-lar,elliptical,andelongated,describingtheelongationqualitydimension.

μAcir(pq)=0,μAelp(vq)=0.15,andμAeld(pq)=0.9.Then,H(pγ′,qel)=

0.9,andAbestγ′,qel=Aeld.Thus,onecansaythat“thegiveninstanceiselongated

toadegree0.9”.

Example4.8.Figures4.4band4.4dshowtheexamplevaluesofgradedqualityfunction(H)calculatedforthepointsp

aandp

bbasedontheirvaluesrelated

tothequalitydimensionsoftwodomains.ForexampleinFigure4.4b,supposepa

isnotincludedinanyregioninδa.Then,twovectorindicesofthesymbolvectorinthequalitylayergetvaluesas:V

Qγ′(q

a1)=(label(A

bestγ′,qa

1),0.7)and

VQγ′(q

a2)=(label(A

bestγ′,qa

2),0.6).

4.2.2PhaseB:InferenceinSymbolSpace(Lexicalisationand

Realisation)

Inthisphase,theaimistoinferalinguisticlexiconfromthesymbolvectorofanunknowninstance,togeneratedescriptionsinnaturallanguageform.FromtheNLGperspective,thisisthetaskoflexicalisation,thatdecideswhichlinguisticterms(i.e.,naturalwords)shouldbeselectedfromthedeterminedcontent[182].Thiscanbedonebyverbalisingthelinguisticlabelsthatarecal-culatedandstoredinthesymbolvector.Thisverbalisationisdoneeitherwithannotatinganewinstanceviatheconceptlabels,orwithcharacterisingtheinstanceviathequalitydimensionlabels.Thetasksofannotationandchar-acterisationwillassignasetoflexicalitemstoanunknownobservation.Thiscollectionoflinguistictermsisthenturnedtothenaturallanguagephrases(i.e.,sentences)usingtherealisationtoolsinNLGsystems.Algorithm4.2showsthestepsofthetasksinphaseB.

AnnotationintheConceptLayer

Annotationforaninstanceγ′istoannotateasetoflinguisticlabelswhicharederivedfromtheassociatedconceptsintheconceptlayerofsymbolvec-


Figure 4.5: Example of the membership functions of the linguistic terms circu-lar, elliptical, and elongated, describing the elongation quality dimension.

μAcir(pq) = 0, μAelp(vq) = 0.15, and μAeld(pq) = 0.9. Then, H(pγ′ ,qel) =

0.9, and Abestγ′,qel

= Aeld. Thus, one can say that “the given instance is elongatedto a degree 0.9”.

Example 4.8. Figures 4.4b and 4.4d show the example values of graded qualityfunction (H) calculated for the points pa and pb based on their values relatedto the quality dimensions of two domains. For example in Figure 4.4b, supposepa is not included in any region in δa. Then, two vector indices of the symbolvector in the quality layer get values as: VQ

γ′(qa1 ) = (label(Abest

γ′,qa1), 0.7) and

VQγ′(qa

2 ) = (label(Abestγ′,qa

2), 0.6).

4.2.2 Phase B: Inference in Symbol Space (Lexicalisation and

Realisation)

In this phase, the aim is to infer a linguistic lexicon from the symbol vectorof an unknown instance, to generate descriptions in natural language form.From the NLG perspective, this is the task of lexicalisation, that decides whichlinguistic terms (i.e., natural words) should be selected from the determinedcontent [182]. This can be done by verbalising the linguistic labels that are cal-culated and stored in the symbol vector. This verbalisation is done either withannotating a new instance via the concept labels, or with characterising theinstance via the quality dimension labels. The tasks of annotation and char-acterisation will assign a set of lexical items to an unknown observation. Thiscollection of linguistic terms is then turned to the natural language phrases (i.e.,sentences) using the realisation tools in NLG systems. Algorithm 4.2 shows thesteps of the tasks in phase B.

Annotation in the Concept Layer

Annotation for an instance γ ′ is to annotate a set of linguistic labels whichare derived from the associated concepts in the concept layer of symbol vec-


Figure4.5:Exampleofthemembershipfunctionsofthelinguistictermscircu-lar,elliptical,andelongated,describingtheelongationqualitydimension.

μAcir(pq)=0,μAelp(vq)=0.15,andμAeld(pq)=0.9.Then,H(pγ′,qel)=

0.9,andAbestγ′,qel=Aeld.Thus,onecansaythat“thegiveninstanceiselongated

toadegree0.9”.

Example4.8.Figures4.4band4.4dshowtheexamplevaluesofgradedqualityfunction(H)calculatedforthepointsp

aandp

bbasedontheirvaluesrelated

tothequalitydimensionsoftwodomains.ForexampleinFigure4.4b,supposepa

isnotincludedinanyregioninδa.Then,twovectorindicesofthesymbolvectorinthequalitylayergetvaluesas:V

Qγ′(q

a1)=(label(A

bestγ′,qa

1),0.7)and

VQγ′(q

a2)=(label(A

bestγ′,qa

2),0.6).

4.2.2PhaseB:InferenceinSymbolSpace(Lexicalisationand

Realisation)

Inthisphase,theaimistoinferalinguisticlexiconfromthesymbolvectorofanunknowninstance,togeneratedescriptionsinnaturallanguageform.FromtheNLGperspective,thisisthetaskoflexicalisation,thatdecideswhichlinguisticterms(i.e.,naturalwords)shouldbeselectedfromthedeterminedcontent[182].Thiscanbedonebyverbalisingthelinguisticlabelsthatarecal-culatedandstoredinthesymbolvector.Thisverbalisationisdoneeitherwithannotatinganewinstanceviatheconceptlabels,orwithcharacterisingtheinstanceviathequalitydimensionlabels.Thetasksofannotationandchar-acterisationwillassignasetoflexicalitemstoanunknownobservation.Thiscollectionoflinguistictermsisthenturnedtothenaturallanguagephrases(i.e.,sentences)usingtherealisationtoolsinNLGsystems.Algorithm4.2showsthestepsofthetasksinphaseB.

AnnotationintheConceptLayer

Annotationforaninstanceγ′istoannotateasetoflinguisticlabelswhicharederivedfromtheassociatedconceptsintheconceptlayerofsymbolvec-


Algorithm 4.2: Inference in Symbol Space

Function SymbolSpaceInference(Vγ′ )// Annotation in the concept layerforeach Cy ∈ C do

if Vγ′,C(Cy) = (∅, 0) thenC(γ ′) ← C(γ ′) ∪ {Cy}

OC(γ′) ← OrderConceptLabels(C(γ ′))

TC(γ′) ← Annotate(OC(γ

′),Vγ′,C) // in concept layer// Characterisation in the quality layerforeach δ /∈ Δ(C(γ ′)) do

Q(γ ′) ← Q(γ ′) ∪ {Q(δ)}

OQ(γ′) ← OrderQualityLabels(Q(γ ′))

TQ(γ′) ← Characterise(Q(γ ′),Vγ′,Q) // in quality layer

// Linguistic RealisationTγ′ ← Realise(TC(γ

′),TQ(γ′))

return Tγ′

tor Vγ′,C. Each Cy is included in the set of associated concepts of the in-stance, C(γ ′), if the corresponding element in Vγ′,C is not empty. Formally,Cy ∈ C(γ ′) ⇐⇒ Vγ′,C(Cy) = (∅, 0).

Example 4.9. Considering Figures 4.4a and 4.4b, γ ′ is associated with onlyone concept Cyi

. So, C(γ ′) = {Cyi}. In Figure 4.4c, γ ′ is associated with two

concepts Cyiand Cyj

. So, C(γ ′) = {Cyi,Cyj

}. Finally, in Figure 4.4d, there areno associated concepts. So, C(γ ′) = ∅.

After determining C(γ ′), if γ ′ is associated with two or more distinct con-cepts as C(γ ′) = {Cyi

,Cyj, . . .}, then γ ′ is an instance of all the associated

concepts. In this case, an extra process is needed to sort and combine the con-cept labels to annotate γ ′ with a new set of linguistic labels.

The task of concept combination is discussed widely in the literature ofconceptual space theory [86, 142, 188]. What is important for the inference isto distinguish which concept labels are the modifiers and which are the mod-ified concepts. This distinction leads to order the labels in the final linguis-tic expression [9]. In particular, an ordered set4 of the associated concepts,OC(γ

′) = {C ′1,C ′

2, . . .} is defined to prioritise the modifier concepts over mod-ified ones. Since there is no background knowledge to define the semantic or-der of the associated concepts, the ordering process relies based on the gradedmembership values that can be retrieved from Vγ′,C.

4In the set theory, an ordered set is defined as a set of elements, plus a relation � between eachpair of the elements that presents the order of them [229].


Algorithm4.2:InferenceinSymbolSpace

FunctionSymbolSpaceInference(Vγ′)//Annotationintheconceptlayer

foreachCy∈CdoifVγ′,C(Cy)=(∅,0)then

C(γ′)←C(γ′)∪{Cy}

OC(γ′)←OrderConceptLabels(C(γ′))TC(γ′)←Annotate(OC(γ′),Vγ′,C)//inconceptlayer//Characterisationinthequalitylayer

foreachδ/∈Δ(C(γ′))doQ(γ′)←Q(γ′)∪{Q(δ)}

OQ(γ′)←OrderQualityLabels(Q(γ′))TQ(γ′)←Characterise(Q(γ′),Vγ′,Q)//inqualitylayer//LinguisticRealisationTγ′←Realise(TC(γ′),TQ(γ′))

returnTγ′

torVγ′,C.EachCyisincludedinthesetofassociatedconceptsofthein-stance,C(γ′),ifthecorrespondingelementinVγ′,Cisnotempty.Formally,Cy∈C(γ′)⇐⇒Vγ′,C(Cy)=(∅,0).

Example4.9.ConsideringFigures4.4aand4.4b,γ′isassociatedwithonlyoneconceptCyi.So,C(γ′)={Cyi}.InFigure4.4c,γ′isassociatedwithtwoconceptsCyiandCyj.So,C(γ′)={Cyi,Cyj}.Finally,inFigure4.4d,therearenoassociatedconcepts.So,C(γ′)=∅.

AfterdeterminingC(γ′),ifγ′isassociatedwithtwoormoredistinctcon-ceptsasC(γ′)={Cyi,Cyj,...},thenγ′isaninstanceofalltheassociatedconcepts.Inthiscase,anextraprocessisneededtosortandcombinethecon-ceptlabelstoannotateγ′withanewsetoflinguisticlabels.

Thetaskofconceptcombinationisdiscussedwidelyintheliteratureofconceptualspacetheory[86,142,188].Whatisimportantfortheinferenceistodistinguishwhichconceptlabelsarethemodifiersandwhicharethemod-ifiedconcepts.Thisdistinctionleadstoorderthelabelsinthefinallinguis-ticexpression[9].Inparticular,anorderedset4oftheassociatedconcepts,OC(γ′)={C′

1,C′2,...}isdefinedtoprioritisethemodifierconceptsovermod-

ifiedones.Sincethereisnobackgroundknowledgetodefinethesemanticor-deroftheassociatedconcepts,theorderingprocessreliesbasedonthegradedmembershipvaluesthatcanberetrievedfromVγ′,C.

4Inthesettheory,anorderedsetisdefinedasasetofelements,plusarelation�betweeneachpairoftheelementsthatpresentstheorderofthem[229].


Algorithm 4.2: Inference in Symbol Space

Function SymbolSpaceInference(Vγ′ )// Annotation in the concept layerforeach Cy ∈ C do

if Vγ′,C(Cy) = (∅, 0) thenC(γ ′) ← C(γ ′) ∪ {Cy}

OC(γ′) ← OrderConceptLabels(C(γ ′))

TC(γ′) ← Annotate(OC(γ

′),Vγ′,C) // in concept layer// Characterisation in the quality layerforeach δ /∈ Δ(C(γ ′)) do

Q(γ ′) ← Q(γ ′) ∪ {Q(δ)}

OQ(γ′) ← OrderQualityLabels(Q(γ ′))

TQ(γ′) ← Characterise(Q(γ ′),Vγ′,Q) // in quality layer

// Linguistic RealisationTγ′ ← Realise(TC(γ

′),TQ(γ′))

return Tγ′

tor Vγ′,C. Each Cy is included in the set of associated concepts of the in-stance, C(γ ′), if the corresponding element in Vγ′,C is not empty. Formally,Cy ∈ C(γ ′) ⇐⇒ Vγ′,C(Cy) = (∅, 0).

Example 4.9. Considering Figures 4.4a and 4.4b, γ ′ is associated with onlyone concept Cyi

. So, C(γ ′) = {Cyi}. In Figure 4.4c, γ ′ is associated with two

concepts Cyiand Cyj

. So, C(γ ′) = {Cyi,Cyj

}. Finally, in Figure 4.4d, there areno associated concepts. So, C(γ ′) = ∅.

After determining C(γ ′), if γ ′ is associated with two or more distinct con-cepts as C(γ ′) = {Cyi

,Cyj, . . .}, then γ ′ is an instance of all the associated

concepts. In this case, an extra process is needed to sort and combine the con-cept labels to annotate γ ′ with a new set of linguistic labels.

The task of concept combination is discussed widely in the literature ofconceptual space theory [86, 142, 188]. What is important for the inference isto distinguish which concept labels are the modifiers and which are the mod-ified concepts. This distinction leads to order the labels in the final linguis-tic expression [9]. In particular, an ordered set4 of the associated concepts,OC(γ

′) = {C ′1,C ′

2, . . .} is defined to prioritise the modifier concepts over mod-ified ones. Since there is no background knowledge to define the semantic or-der of the associated concepts, the ordering process relies based on the gradedmembership values that can be retrieved from Vγ′,C.

4In the set theory, an ordered set is defined as a set of elements, plus a relation � between eachpair of the elements that presents the order of them [229].


Algorithm4.2:InferenceinSymbolSpace

FunctionSymbolSpaceInference(Vγ′)//Annotationintheconceptlayer

foreachCy∈CdoifVγ′,C(Cy)=(∅,0)then

C(γ′)←C(γ′)∪{Cy}

OC(γ′)←OrderConceptLabels(C(γ′))TC(γ′)←Annotate(OC(γ′),Vγ′,C)//inconceptlayer//Characterisationinthequalitylayer

foreachδ/∈Δ(C(γ′))doQ(γ′)←Q(γ′)∪{Q(δ)}

OQ(γ′)←OrderQualityLabels(Q(γ′))TQ(γ′)←Characterise(Q(γ′),Vγ′,Q)//inqualitylayer//LinguisticRealisationTγ′←Realise(TC(γ′),TQ(γ′))

returnTγ′

torVγ′,C.EachCyisincludedinthesetofassociatedconceptsofthein-stance,C(γ′),ifthecorrespondingelementinVγ′,Cisnotempty.Formally,Cy∈C(γ′)⇐⇒Vγ′,C(Cy)=(∅,0).

Example4.9.ConsideringFigures4.4aand4.4b,γ′isassociatedwithonlyoneconceptCyi.So,C(γ′)={Cyi}.InFigure4.4c,γ′isassociatedwithtwoconceptsCyiandCyj.So,C(γ′)={Cyi,Cyj}.Finally,inFigure4.4d,therearenoassociatedconcepts.So,C(γ′)=∅.

AfterdeterminingC(γ′),ifγ′isassociatedwithtwoormoredistinctcon-ceptsasC(γ′)={Cyi,Cyj,...},thenγ′isaninstanceofalltheassociatedconcepts.Inthiscase,anextraprocessisneededtosortandcombinethecon-ceptlabelstoannotateγ′withanewsetoflinguisticlabels.

Thetaskofconceptcombinationisdiscussedwidelyintheliteratureofconceptualspacetheory[86,142,188].Whatisimportantfortheinferenceistodistinguishwhichconceptlabelsarethemodifiersandwhicharethemod-ifiedconcepts.Thisdistinctionleadstoorderthelabelsinthefinallinguis-ticexpression[9].Inparticular,anorderedset4oftheassociatedconcepts,OC(γ′)={C′

1,C′2,...}isdefinedtoprioritisethemodifierconceptsovermod-

ifiedones.Sincethereisnobackgroundknowledgetodefinethesemanticor-deroftheassociatedconcepts,theorderingprocessreliesbasedonthegradedmembershipvaluesthatcanberetrievedfromVγ′,C.

4Inthesettheory,anorderedsetisdefinedasasetofelements,plusarelation�betweeneachpairoftheelementsthatpresentstheorderofthem[229].


Example 4.10. Considering the case three in Figure 4.4c, since γ ′ is located insub-concept cbyi

with graded membership 1 and in sub-concept cayjwith graded

membership 0.9, then the corresponding ordered set of concepts for γ ′ will be:OC(γ

′) = {Cyi,Cyj

}. Suppose there is an unknown instance γ ′ in a conceptualspace including Colour and Taste domains. If γ ′ is located in both ’red’ sub-concept with grading degree 0.95 and ’sweet’ sub-concept with grading degree0.7 (in colour and taste domains, respectively), then the corresponding orderedset of concepts is: OC(γ

′) = {Cred,Csweet}.

The set of annotations for γ ′ then is defined as an ordered set of lexical itemsTC(γ

′) = {label(C ′1), label(C

′2), . . .}. These annotations are the corresponding

linguistic terms to the ordered concepts in OC(γ′), that are retrieved from the

labels in Vγ′,C for the concepts in C(γ ′).

Example 4.11. For the conceptual space of leaves presented in Example 3.5,suppose an unknown leaf sample γ ′ is associated with both known con-cepts Tilia and Nerium. Assume that Vγ′,C(Ctt) = (label(Ctt), 0.5) andVγ′,C(Cno) = (label(Cno), 0.9). Then, TC(γ

′) = {label(Cno) = ‘Nerium’,label(Ctt) = ‘somewhat Tilia’}.

Characterisation in the Quality Layer

Characterisation for γ ′ is to assign the linguistic descriptions of the associatedquality dimension based on the values in the quality layer of the symbol vectorVγ′,Q. The motivation behind the characterisation comes from the lack of theconcept annotation in the cases with no associated concept within the domains(like case four and partially case two in Figure 4.4). This is especially importantfor those instances that are completely unknown for the systems and are notrepresentable by any of the defined concepts, but still are explainable with theirquality dimensions’ values.

According to Algorithm 4.1, if γ ′ within a domain does not belong to anysub-concept, then Vγ′,Q gets values from the domain’s quality dimensions. Ob-viously, if γ ′ has even one associated concept within the domain, there is noneed to involve the quality dimensions of that domain in the characterisationprocess. For the calculated Vγ′,Q, each quality dimension q is included in theset of associated quality dimensions Q(γ ′), if the corresponding elements in theVγ′,Q are not empty. Formally, q ∈ Q(γ ′) ⇐⇒ Vγ′,Q(q) = (∅, 0).

Example 4.12. Considering Figure 4.4b, γ ′ is not associated with any con-cepts within δa. So, Q(γ ′) = {qa

1 ,qa2 }. Also, in Figure 4.4d, there are no as-

sociated concepts in any of the domains. So, Q(γ ′) = {qa1 ,qa

2 ,qb1 ,qb

2 ,qb3 }. In

Figures 4.4a and 4.4c, Q(γ ′) = ∅.


Example4.10.ConsideringthecasethreeinFigure4.4c,sinceγ′islocatedinsub-conceptc

byiwithgradedmembership1andinsub-conceptc

ayjwithgraded

membership0.9,thenthecorrespondingorderedsetofconceptsforγ′willbe:OC(γ′)={Cyi,Cyj}.Supposethereisanunknowninstanceγ′inaconceptualspaceincludingColourandTastedomains.Ifγ′islocatedinboth’red’sub-conceptwithgradingdegree0.95and’sweet’sub-conceptwithgradingdegree0.7(incolourandtastedomains,respectively),thenthecorrespondingorderedsetofconceptsis:OC(γ′)={Cred,Csweet}.

Thesetofannotationsforγ′thenisdefinedasanorderedsetoflexicalitemsTC(γ′)={label(C′

1),label(C′2),...}.Theseannotationsarethecorresponding

linguistictermstotheorderedconceptsinOC(γ′),thatareretrievedfromthelabelsinVγ′,CfortheconceptsinC(γ′).

Example4.11.FortheconceptualspaceofleavespresentedinExample3.5,supposeanunknownleafsampleγ′isassociatedwithbothknowncon-ceptsTiliaandNerium.AssumethatVγ′,C(Ctt)=(label(Ctt),0.5)andVγ′,C(Cno)=(label(Cno),0.9).Then,TC(γ′)={label(Cno)=‘Nerium’,label(Ctt)=‘somewhatTilia’}.

CharacterisationintheQualityLayer

Characterisationforγ′istoassignthelinguisticdescriptionsoftheassociatedqualitydimensionbasedonthevaluesinthequalitylayerofthesymbolvectorVγ′,Q.Themotivationbehindthecharacterisationcomesfromthelackoftheconceptannotationinthecaseswithnoassociatedconceptwithinthedomains(likecasefourandpartiallycasetwoinFigure4.4).Thisisespeciallyimportantforthoseinstancesthatarecompletelyunknownforthesystemsandarenotrepresentablebyanyofthedefinedconcepts,butstillareexplainablewiththeirqualitydimensions’values.

AccordingtoAlgorithm4.1,ifγ′withinadomaindoesnotbelongtoanysub-concept,thenVγ′,Qgetsvaluesfromthedomain’squalitydimensions.Ob-viously,ifγ′hasevenoneassociatedconceptwithinthedomain,thereisnoneedtoinvolvethequalitydimensionsofthatdomaininthecharacterisationprocess.ForthecalculatedVγ′,Q,eachqualitydimensionqisincludedinthesetofassociatedqualitydimensionsQ(γ′),ifthecorrespondingelementsintheVγ′,Qarenotempty.Formally,q∈Q(γ′)⇐⇒Vγ′,Q(q)=(∅,0).

Example4.12.ConsideringFigure4.4b,γ′isnotassociatedwithanycon-ceptswithinδa.So,Q(γ′)={q

a1,q

a2}.Also,inFigure4.4d,therearenoas-

sociatedconceptsinanyofthedomains.So,Q(γ′)={qa1,q

a2,q

b1,q

b2,q

b3}.In

Figures4.4aand4.4c,Q(γ′)=∅.


Example 4.10. Considering the case three in Figure 4.4c, since γ ′ is located insub-concept cbyi

with graded membership 1 and in sub-concept cayjwith graded

membership 0.9, then the corresponding ordered set of concepts for γ ′ will be:OC(γ

′) = {Cyi,Cyj

}. Suppose there is an unknown instance γ ′ in a conceptualspace including Colour and Taste domains. If γ ′ is located in both ’red’ sub-concept with grading degree 0.95 and ’sweet’ sub-concept with grading degree0.7 (in colour and taste domains, respectively), then the corresponding orderedset of concepts is: OC(γ

′) = {Cred,Csweet}.

The set of annotations for γ ′ then is defined as an ordered set of lexical itemsTC(γ

′) = {label(C ′1), label(C

′2), . . .}. These annotations are the corresponding

linguistic terms to the ordered concepts in OC(γ′), that are retrieved from the

labels in Vγ′,C for the concepts in C(γ ′).

Example 4.11. For the conceptual space of leaves presented in Example 3.5,suppose an unknown leaf sample γ ′ is associated with both known con-cepts Tilia and Nerium. Assume that Vγ′,C(Ctt) = (label(Ctt), 0.5) andVγ′,C(Cno) = (label(Cno), 0.9). Then, TC(γ

′) = {label(Cno) = ‘Nerium’,label(Ctt) = ‘somewhat Tilia’}.

Characterisation in the Quality Layer

Characterisation for γ ′ is to assign the linguistic descriptions of the associatedquality dimension based on the values in the quality layer of the symbol vectorVγ′,Q. The motivation behind the characterisation comes from the lack of theconcept annotation in the cases with no associated concept within the domains(like case four and partially case two in Figure 4.4). This is especially importantfor those instances that are completely unknown for the systems and are notrepresentable by any of the defined concepts, but still are explainable with theirquality dimensions’ values.

According to Algorithm 4.1, if γ ′ within a domain does not belong to anysub-concept, then Vγ′,Q gets values from the domain’s quality dimensions. Ob-viously, if γ ′ has even one associated concept within the domain, there is noneed to involve the quality dimensions of that domain in the characterisationprocess. For the calculated Vγ′,Q, each quality dimension q is included in theset of associated quality dimensions Q(γ ′), if the corresponding elements in theVγ′,Q are not empty. Formally, q ∈ Q(γ ′) ⇐⇒ Vγ′,Q(q) = (∅, 0).

Example 4.12. Considering Figure 4.4b, γ ′ is not associated with any con-cepts within δa. So, Q(γ ′) = {qa

1 ,qa2 }. Also, in Figure 4.4d, there are no as-

sociated concepts in any of the domains. So, Q(γ ′) = {qa1 ,qa

2 ,qb1 ,qb

2 ,qb3 }. In

Figures 4.4a and 4.4c, Q(γ ′) = ∅.


Example4.10.ConsideringthecasethreeinFigure4.4c,sinceγ′islocatedinsub-conceptc

byiwithgradedmembership1andinsub-conceptc

ayjwithgraded

membership0.9,thenthecorrespondingorderedsetofconceptsforγ′willbe:OC(γ′)={Cyi,Cyj}.Supposethereisanunknowninstanceγ′inaconceptualspaceincludingColourandTastedomains.Ifγ′islocatedinboth’red’sub-conceptwithgradingdegree0.95and’sweet’sub-conceptwithgradingdegree0.7(incolourandtastedomains,respectively),thenthecorrespondingorderedsetofconceptsis:OC(γ′)={Cred,Csweet}.

Thesetofannotationsforγ′thenisdefinedasanorderedsetoflexicalitemsTC(γ′)={label(C′

1),label(C′2),...}.Theseannotationsarethecorresponding

linguistictermstotheorderedconceptsinOC(γ′),thatareretrievedfromthelabelsinVγ′,CfortheconceptsinC(γ′).

Example4.11.FortheconceptualspaceofleavespresentedinExample3.5,supposeanunknownleafsampleγ′isassociatedwithbothknowncon-ceptsTiliaandNerium.AssumethatVγ′,C(Ctt)=(label(Ctt),0.5)andVγ′,C(Cno)=(label(Cno),0.9).Then,TC(γ′)={label(Cno)=‘Nerium’,label(Ctt)=‘somewhatTilia’}.

CharacterisationintheQualityLayer

Characterisationforγ′istoassignthelinguisticdescriptionsoftheassociatedqualitydimensionbasedonthevaluesinthequalitylayerofthesymbolvectorVγ′,Q.Themotivationbehindthecharacterisationcomesfromthelackoftheconceptannotationinthecaseswithnoassociatedconceptwithinthedomains(likecasefourandpartiallycasetwoinFigure4.4).Thisisespeciallyimportantforthoseinstancesthatarecompletelyunknownforthesystemsandarenotrepresentablebyanyofthedefinedconcepts,butstillareexplainablewiththeirqualitydimensions’values.

AccordingtoAlgorithm4.1,ifγ′withinadomaindoesnotbelongtoanysub-concept,thenVγ′,Qgetsvaluesfromthedomain’squalitydimensions.Ob-viously,ifγ′hasevenoneassociatedconceptwithinthedomain,thereisnoneedtoinvolvethequalitydimensionsofthatdomaininthecharacterisationprocess.ForthecalculatedVγ′,Q,eachqualitydimensionqisincludedinthesetofassociatedqualitydimensionsQ(γ′),ifthecorrespondingelementsintheVγ′,Qarenotempty.Formally,q∈Q(γ′)⇐⇒Vγ′,Q(q)=(∅,0).

Example4.12.ConsideringFigure4.4b,γ′isnotassociatedwithanycon-ceptswithinδa.So,Q(γ′)={q

a1,q

a2}.Also,inFigure4.4d,therearenoas-

sociatedconceptsinanyofthedomains.So,Q(γ′)={qa1,q

a2,q

b1,q

b2,q

b3}.In

Figures4.4aand4.4c,Q(γ′)=∅.


Based on Equation 4.6, labelγ′,q = label(Abestγ′,q ) is the linguistic term re-

lated to the best interval of the quality dimension with the maximum degree ofmembership.5

Similar to the process of annotation, a sorting operation is needed to derivethe order of quality dimension labels in the final linguistic descriptions. Relyingon the graded quality values in Vγ′,Q, an ordered set of the associated qualitydimensions, OQ(γ

′) = {q ′1,q ′

2, . . .}, is defined. The characterisation set for γ ′

then is simply defined as an ordered set of lexical items TQ(γ′) = {label(Abest

γ′,q′1),

label(Abestγ′,q′

2), . . .}. These characterisations are retrieved from the correspond-

ing labels in Vγ′.Q, for the quality dimensions in Q(γ ′).

Example 4.13. Considering Example 4.12, suppose γ ′ is not associated withany known concepts. Assume that for the quality dimensions elongation androundness, Vγ′,Q(qel) = (label(Aeld), 0.9) and Vγ′,Q(qro) = (label(Aro), 0.7)(referring to Example 4.6). Then, TQ(γ

′) = {label(Aeld) = ‘elongated’,label(Aro) = ‘lanced shape’}. For the above example, if the values of dimen-sions (with the value range [0 1]) are vweight = 0.9 and vhight = 0.2, then thecorresponding characterisation set will be TQ(γ

′) = {label(αqweight) =′ heavy ′,

label(αqlength) =′ short length ′}.

Phrase Specification and Linguistic Realisation

Before applying the linguistic realisation, there is a need to specify an abstractrepresentation of the provided set of lexicons. According to [185], messagesare the abstractions that mediates between the set of lexicons and eventualtext. Here, based on the annotation and characterisation lexicons, two types ofmessages can be defined as: AnnotationMsg and CharacterisationMsg. Phrasespecification is a structure to specify the elements of a message for a singlesentence, which is the proper representation of the output for microplanningand the input for realisation. Various levels of abstraction have been proposedfor phrase specification such as Syntactic structure, Canned text, Case frame,etc. [185]. Since the aim was to describe an observation with its attribute labels,the complexity of the final sentence is limited to a reasonably straightforwardtemplate of abstract representation. Here the syntactic structure is employed,which describes the linguistic elements to be used as uninflected words, plus aset of features to determine how to realise the final text. Figure 4.6 illustratesthe simple template for the final text in an attribute value matrix (AVM) for-mat [185,237].

5The linguistic name of a quality dimension label(q) is Hq, as defined in Definition 3.2, butthis name can be also added to the linguistic label of the fuzzy intervals on quality dimension. Asan example, for two dimensions ‘time’ and ‘length’, the linguistic labels of the first intervals in bothcan be ‘short’, but to be precise, the labels can consist of the dimension names to be defined as‘short time’ and ‘short length’, respectively.


BasedonEquation4.6,labelγ′,q=label(Abestγ′,q)isthelinguistictermre-

latedtothebestintervalofthequalitydimensionwiththemaximumdegreeofmembership.5

Similartotheprocessofannotation,asortingoperationisneededtoderivetheorderofqualitydimensionlabelsinthefinallinguisticdescriptions.RelyingonthegradedqualityvaluesinVγ′,Q,anorderedsetoftheassociatedqualitydimensions,OQ(γ′)={q′

1,q′2,...},isdefined.Thecharacterisationsetforγ′

thenissimplydefinedasanorderedsetoflexicalitemsTQ(γ′)={label(Abestγ′,q′

1),


2),...}.Thesecharacterisationsareretrievedfromthecorrespond-

inglabelsinVγ′.Q,forthequalitydimensionsinQ(γ′).

Example4.13.ConsideringExample4.12,supposeγ′isnotassociatedwithanyknownconcepts.Assumethatforthequalitydimensionselongationandroundness,Vγ′,Q(qel)=(label(Aeld),0.9)andVγ′,Q(qro)=(label(Aro),0.7)(referringtoExample4.6).Then,TQ(γ′)={label(Aeld)=‘elongated’,label(Aro)=‘lancedshape’}.Fortheaboveexample,ifthevaluesofdimen-sions(withthevaluerange[01])arevweight=0.9andvhight=0.2,thenthecorrespondingcharacterisationsetwillbeTQ(γ′)={label(αqweight)=′heavy′,label(αqlength)=′shortlength′}.

PhraseSpecificationandLinguisticRealisation

Beforeapplyingthelinguisticrealisation,thereisaneedtospecifyanabstractrepresentationoftheprovidedsetoflexicons.Accordingto[185],messagesaretheabstractionsthatmediatesbetweenthesetoflexiconsandeventualtext.Here,basedontheannotationandcharacterisationlexicons,twotypesofmessagescanbedefinedas:AnnotationMsgandCharacterisationMsg.Phrasespecificationisastructuretospecifytheelementsofamessageforasinglesentence,whichistheproperrepresentationoftheoutputformicroplanningandtheinputforrealisation.VariouslevelsofabstractionhavebeenproposedforphrasespecificationsuchasSyntacticstructure,Cannedtext,Caseframe,etc.[185].Sincetheaimwastodescribeanobservationwithitsattributelabels,thecomplexityofthefinalsentenceislimitedtoareasonablystraightforwardtemplateofabstractrepresentation.Herethesyntacticstructureisemployed,whichdescribesthelinguisticelementstobeusedasuninflectedwords,plusasetoffeaturestodeterminehowtorealisethefinaltext.Figure4.6illustratesthesimpletemplateforthefinaltextinanattributevaluematrix(AVM)for-mat[185,237].

5Thelinguisticnameofaqualitydimensionlabel(q)isHq,asdefinedinDefinition3.2,butthisnamecanbealsoaddedtothelinguisticlabelofthefuzzyintervalsonqualitydimension.Asanexample,fortwodimensions‘time’and‘length’,thelinguisticlabelsofthefirstintervalsinbothcanbe‘short’,buttobeprecise,thelabelscanconsistofthedimensionnamestobedefinedas‘shorttime’and‘shortlength’,respectively.


Based on Equation 4.6, labelγ′,q = label(Abestγ′,q ) is the linguistic term re-

lated to the best interval of the quality dimension with the maximum degree ofmembership.5

Similar to the process of annotation, a sorting operation is needed to derivethe order of quality dimension labels in the final linguistic descriptions. Relyingon the graded quality values in Vγ′,Q, an ordered set of the associated qualitydimensions, OQ(γ

′) = {q ′1,q ′

2, . . .}, is defined. The characterisation set for γ ′

then is simply defined as an ordered set of lexical items TQ(γ′) = {label(Abest

γ′,q′1),


2), . . .}. These characterisations are retrieved from the correspond-

ing labels in Vγ′.Q, for the quality dimensions in Q(γ ′).

Example 4.13. Considering Example 4.12, suppose γ ′ is not associated withany known concepts. Assume that for the quality dimensions elongation androundness, Vγ′,Q(qel) = (label(Aeld), 0.9) and Vγ′,Q(qro) = (label(Aro), 0.7)(referring to Example 4.6). Then, TQ(γ

′) = {label(Aeld) = ‘elongated’,label(Aro) = ‘lanced shape’}. For the above example, if the values of dimen-sions (with the value range [0 1]) are vweight = 0.9 and vhight = 0.2, then thecorresponding characterisation set will be TQ(γ

′) = {label(αqweight) =′ heavy ′,

label(αqlength) =′ short length ′}.

Phrase Specification and Linguistic Realisation

Before applying the linguistic realisation, there is a need to specify an abstractrepresentation of the provided set of lexicons. According to [185], messagesare the abstractions that mediates between the set of lexicons and eventualtext. Here, based on the annotation and characterisation lexicons, two types ofmessages can be defined as: AnnotationMsg and CharacterisationMsg. Phrasespecification is a structure to specify the elements of a message for a singlesentence, which is the proper representation of the output for microplanningand the input for realisation. Various levels of abstraction have been proposedfor phrase specification such as Syntactic structure, Canned text, Case frame,etc. [185]. Since the aim was to describe an observation with its attribute labels,the complexity of the final sentence is limited to a reasonably straightforwardtemplate of abstract representation. Here the syntactic structure is employed,which describes the linguistic elements to be used as uninflected words, plus aset of features to determine how to realise the final text. Figure 4.6 illustratesthe simple template for the final text in an attribute value matrix (AVM) for-mat [185,237].

5The linguistic name of a quality dimension label(q) is Hq, as defined in Definition 3.2, butthis name can be also added to the linguistic label of the fuzzy intervals on quality dimension. Asan example, for two dimensions ‘time’ and ‘length’, the linguistic labels of the first intervals in bothcan be ‘short’, but to be precise, the labels can consist of the dimension names to be defined as‘short time’ and ‘short length’, respectively.


BasedonEquation4.6,labelγ′,q=label(Abestγ′,q)isthelinguistictermre-

latedtothebestintervalofthequalitydimensionwiththemaximumdegreeofmembership.5

Similartotheprocessofannotation,asortingoperationisneededtoderivetheorderofqualitydimensionlabelsinthefinallinguisticdescriptions.RelyingonthegradedqualityvaluesinVγ′,Q,anorderedsetoftheassociatedqualitydimensions,OQ(γ′)={q′

1,q′2,...},isdefined.Thecharacterisationsetforγ′

thenissimplydefinedasanorderedsetoflexicalitemsTQ(γ′)={label(Abestγ′,q′

1),


2),...}.Thesecharacterisationsareretrievedfromthecorrespond-

inglabelsinVγ′.Q,forthequalitydimensionsinQ(γ′).

Example4.13.ConsideringExample4.12,supposeγ′isnotassociatedwithanyknownconcepts.Assumethatforthequalitydimensionselongationandroundness,Vγ′,Q(qel)=(label(Aeld),0.9)andVγ′,Q(qro)=(label(Aro),0.7)(referringtoExample4.6).Then,TQ(γ′)={label(Aeld)=‘elongated’,label(Aro)=‘lancedshape’}.Fortheaboveexample,ifthevaluesofdimen-sions(withthevaluerange[01])arevweight=0.9andvhight=0.2,thenthecorrespondingcharacterisationsetwillbeTQ(γ′)={label(αqweight)=′heavy′,label(αqlength)=′shortlength′}.

PhraseSpecificationandLinguisticRealisation

Beforeapplyingthelinguisticrealisation,thereisaneedtospecifyanabstractrepresentationoftheprovidedsetoflexicons.Accordingto[185],messagesaretheabstractionsthatmediatesbetweenthesetoflexiconsandeventualtext.Here,basedontheannotationandcharacterisationlexicons,twotypesofmessagescanbedefinedas:AnnotationMsgandCharacterisationMsg.Phrasespecificationisastructuretospecifytheelementsofamessageforasinglesentence,whichistheproperrepresentationoftheoutputformicroplanningandtheinputforrealisation.VariouslevelsofabstractionhavebeenproposedforphrasespecificationsuchasSyntacticstructure,Cannedtext,Caseframe,etc.[185].Sincetheaimwastodescribeanobservationwithitsattributelabels,thecomplexityofthefinalsentenceislimitedtoareasonablystraightforwardtemplateofabstractrepresentation.Herethesyntacticstructureisemployed,whichdescribesthelinguisticelementstobeusedasuninflectedwords,plusasetoffeaturestodeterminehowtorealisethefinaltext.Figure4.6illustratesthesimpletemplateforthefinaltextinanattributevaluematrix(AVM)for-mat[185,237].

5Thelinguisticnameofaqualitydimensionlabel(q)isHq,asdefinedinDefinition3.2,butthisnamecanbealsoaddedtothelinguisticlabelofthefuzzyintervalsonqualitydimension.Asanexample,fortwodimensions‘time’and‘length’,thelinguisticlabelsofthefirstintervalsinbothcanbe‘short’,buttobeprecise,thelabelscanconsistofthedimensionnamestobedefinedas‘shorttime’and‘shortlength’,respectively.


⎡⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎣

Type: PSAbstractSyntaxconjunct: | also/but |

AnnotationMsg:

⎡⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎣

Type: PSAbstractSyntaxhead: | be/be like |features: [present, positive/negative]

subject:[

Type: PSCannedTexttext: | this [observation (γ ′)] |

]

object:

⎡⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎣

Type: PSAbstractSyntaxhead: | and |

conjunct1:

⎡⎣ Type: PSAbstractSyntax

head: TC(γ′, 1)

features: [definite: false]

⎤⎦

conjunct2:


head: TC(γ′, 2)


⎤⎦

. . .

⎤⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎦

⎤⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎦

CharacterisationMsg:

⎡⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎣

Type: PSAbstractSyntaxhead: | be/have |features: [present, positive]

subject:[

Type: PSCannedTexttext: | it |

]

object:

⎡⎢⎢⎢⎢⎢⎢⎢⎢⎢⎣

Type: PSAbstractSyntaxhead: | and/with |

conjunct1:[

Type: PSAbstractSyntaxhead: TQ(γ

′, 1)

]

conjunct2:[


′, 2)

]

. . .

⎤⎥⎥⎥⎥⎥⎥⎥⎥⎥⎦

⎤⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎦

⎤⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎦

Figure 4.6: An abstract representation for the annotation and characterisationmessages in the AVM format.

Linguistic realisation is the process of applying a set of rules to abstractrepresentations of the lexical items in order to specify well-formed sentences innatural language, which are syntactically and morphologically correct. Morespecifically, the realisation maps the acquired abstract phrase specifications intothe surface text [185]. Some of the linguistic realisations represent sentences bytemplate-like structures when only limited syntactic variability is needed in theoutput description [184]. One instance of an output format for the sentencesto describe the observations is as follows:

“This [Obs.] [be/be not/be like] [a con.label1] [and [a con.label2] and ...],[but/also] it [be/have] [dim.label1] [and/with [dim.label2] and/with ... ].”

As this descriptive sentence is linguistically formed in a simple format, ap-plying any realisation technique (e.g., SimpleNLG engine) will produce gram-matically correct sentences as the output text. The standard architecture of


⎡⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎣

Type:PSAbstractSyntaxconjunct:|also/but|

AnnotationMsg:

⎡⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎣

Type:PSAbstractSyntaxhead:|be/belike|features:[present,positive/negative]

subject:[Type:PSCannedText

text:|this[observation(γ′)]|

]

object:

⎡⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎣

Type:PSAbstractSyntaxhead:|and|

conjunct1:

⎡⎣

Type:PSAbstractSyntaxhead:TC(γ′,1)features:[definite:false]

⎤⎦

conjunct2:

⎡⎣


⎤⎦

...

⎤⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎦

⎤⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎦


⎡⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎣

Type:PSAbstractSyntaxhead:|be/have|features:[present,positive]


text:|it|

]

object:

⎡⎢⎢⎢⎢⎢⎢⎢⎢⎢⎣

Type:PSAbstractSyntaxhead:|and/with|

conjunct1:[Type:PSAbstractSyntax

head:TQ(γ′,1)

]


head:TQ(γ′,2)

]

...

⎤⎥⎥⎥⎥⎥⎥⎥⎥⎥⎦

⎤⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎦

⎤⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎦

Figure4.6:AnabstractrepresentationfortheannotationandcharacterisationmessagesintheAVMformat.

Linguisticrealisationistheprocessofapplyingasetofrulestoabstractrepresentationsofthelexicalitemsinordertospecifywell-formedsentencesinnaturallanguage,whicharesyntacticallyandmorphologicallycorrect.Morespecifically,therealisationmapstheacquiredabstractphrasespecificationsintothesurfacetext[185].Someofthelinguisticrealisationsrepresentsentencesbytemplate-likestructureswhenonlylimitedsyntacticvariabilityisneededintheoutputdescription[184].Oneinstanceofanoutputformatforthesentencestodescribetheobservationsisasfollows:

“This[Obs.][be/benot/belike][acon.label1][and[acon.label2]and...],[but/also]it[be/have][dim.label1][and/with[dim.label2]and/with...].”

Asthisdescriptivesentenceislinguisticallyformedinasimpleformat,ap-plyinganyrealisationtechnique(e.g.,SimpleNLGengine)willproducegram-maticallycorrectsentencesastheoutputtext.Thestandardarchitectureof


⎡⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎣

Type: PSAbstractSyntaxconjunct: | also/but |

AnnotationMsg:

⎡⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎣

Type: PSAbstractSyntaxhead: | be/be like |features: [present, positive/negative]

subject:[

Type: PSCannedTexttext: | this [observation (γ ′)] |

]

object:

⎡⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎣

Type: PSAbstractSyntaxhead: | and |

conjunct1:


head: TC(γ′, 1)


⎤⎦

conjunct2:


head: TC(γ′, 2)


⎤⎦

. . .

⎤⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎦

⎤⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎦


⎡⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎣

Type: PSAbstractSyntaxhead: | be/have |features: [present, positive]

subject:[

Type: PSCannedTexttext: | it |

]

object:

⎡⎢⎢⎢⎢⎢⎢⎢⎢⎢⎣

Type: PSAbstractSyntaxhead: | and/with |

conjunct1:[


′, 1)

]

conjunct2:[


′, 2)

]

. . .

⎤⎥⎥⎥⎥⎥⎥⎥⎥⎥⎦

⎤⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎦

⎤⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎦

Figure 4.6: An abstract representation for the annotation and characterisationmessages in the AVM format.

Linguistic realisation is the process of applying a set of rules to abstractrepresentations of the lexical items in order to specify well-formed sentences innatural language, which are syntactically and morphologically correct. Morespecifically, the realisation maps the acquired abstract phrase specifications intothe surface text [185]. Some of the linguistic realisations represent sentences bytemplate-like structures when only limited syntactic variability is needed in theoutput description [184]. One instance of an output format for the sentencesto describe the observations is as follows:

“This [Obs.] [be/be not/be like] [a con.label1] [and [a con.label2] and ...],[but/also] it [be/have] [dim.label1] [and/with [dim.label2] and/with ... ].”

As this descriptive sentence is linguistically formed in a simple format, ap-plying any realisation technique (e.g., SimpleNLG engine) will produce gram-matically correct sentences as the output text. The standard architecture of


⎡⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎣

Type:PSAbstractSyntaxconjunct:|also/but|

AnnotationMsg:

⎡⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎣

Type:PSAbstractSyntaxhead:|be/belike|features:[present,positive/negative]


text:|this[observation(γ′)]|

]

object:

⎡⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎣

Type:PSAbstractSyntaxhead:|and|

conjunct1:

⎡⎣


⎤⎦

conjunct2:

⎡⎣


⎤⎦

...

⎤⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎦

⎤⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎦


⎡⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎣

Type:PSAbstractSyntaxhead:|be/have|features:[present,positive]


text:|it|

]

object:

⎡⎢⎢⎢⎢⎢⎢⎢⎢⎢⎣

Type:PSAbstractSyntaxhead:|and/with|


head:TQ(γ′,1)

]


head:TQ(γ′,2)

]

...

⎤⎥⎥⎥⎥⎥⎥⎥⎥⎥⎦

⎤⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎦

⎤⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎦

Figure4.6:AnabstractrepresentationfortheannotationandcharacterisationmessagesintheAVMformat.

Linguisticrealisationistheprocessofapplyingasetofrulestoabstractrepresentationsofthelexicalitemsinordertospecifywell-formedsentencesinnaturallanguage,whicharesyntacticallyandmorphologicallycorrect.Morespecifically,therealisationmapstheacquiredabstractphrasespecificationsintothesurfacetext[185].Someofthelinguisticrealisationsrepresentsentencesbytemplate-likestructureswhenonlylimitedsyntacticvariabilityisneededintheoutputdescription[184].Oneinstanceofanoutputformatforthesentencestodescribetheobservationsisasfollows:

“This[Obs.][be/benot/belike][acon.label1][and[acon.label2]and...],[but/also]it[be/have][dim.label1][and/with[dim.label2]and/with...].”

Asthisdescriptivesentenceislinguisticallyformedinasimpleformat,ap-plyinganyrealisationtechnique(e.g.,SimpleNLGengine)willproducegram-maticallycorrectsentencesastheoutputtext.Thestandardarchitectureof


NLG systems provides a well-defined realisation for the abstracted represen-tations, which the details of the architecture can be found in [185] and theimplementation details are available in [95].

Example 4.14. Consider the set of annotations and characterisations from Ex-amples 4.11 and 4.13. Then, the output of realisation will be a message like:“This unknown leaf observation is like Nerium leaves and somewhat Tilialeaves, but it is an elongated and a lance-shaped leaf.”

4.3 Discussion

This chapter has presented the utility of extending conceptual spaces as seman-tic inference models to generate linguistic descriptions for unknown observa-tions in natural language. It has been shown that linking a symbol space to theconceptual space facilitates the process of inferring linguistic characterisationsfor the unknown observations. One advantage of this inference model is thatthe generated descriptions include adequate set of interpretable features thatare derived from the inclusion of unknown observations within the conceptualspace. following, a number of issues related to the semantic inference approachare discussed.

Lexicalisation: On the inference process, one point related to the lexicalisa-tion task is to determine the most descriptive terms to use in order to linguis-tically represent a new observation. As seen in the Leaf example, the most ob-vious psychological description could be regarding a leaf’s similarity or dissim-ilarity to the known concepts. Thus, either the linguistic labels of the knownconcepts or the linguistic terms of quality dimensions which are stored in sym-bol space can be used. So, the semantic interpretability of such labels will affectdirectly on the descriptive quality of the natural output text.

Concepts as nouns or adjectives: One point related to the lexicalisation andrealisation is that from the natural language point of view, the concepts in aconceptual space typically represent the nouns while the sub-concepts or prop-erties are related to the adjectives. The approach to determine linguistic de-scriptions does not make the distinction between adjectives and nouns, andconsequently, between the properties and concepts for the class labels from theinput learning data set6. Preferably, one region within a domain is considered asa concept or sub-concept that can be either nouns or adjectives, and be used to

6The reason is that from machine learning point of view, there is no linguistic distinction be-tween the different types of the classes, in a sense that the learning problem is to classify whetherthe noun concepts, adjective concepts, or even verb concepts. For example, in Iris data set theclasses are nouns, as each class refers to a name of iris plant [146], however, in Wine quality dataset the classes are adjectives, as each class refers to the quality level of wines from excellent to poorquality [64].


NLGsystemsprovidesawell-definedrealisationfortheabstractedrepresen-tations,whichthedetailsofthearchitecturecanbefoundin[185]andtheimplementationdetailsareavailablein[95].

Example4.14.ConsiderthesetofannotationsandcharacterisationsfromEx-amples4.11and4.13.Then,theoutputofrealisationwillbeamessagelike:“ThisunknownleafobservationislikeNeriumleavesandsomewhatTilialeaves,butitisanelongatedandalance-shapedleaf.”

4.3Discussion

Thischapterhaspresentedtheutilityofextendingconceptualspacesasseman-ticinferencemodelstogeneratelinguisticdescriptionsforunknownobserva-tionsinnaturallanguage.Ithasbeenshownthatlinkingasymbolspacetotheconceptualspacefacilitatestheprocessofinferringlinguisticcharacterisationsfortheunknownobservations.Oneadvantageofthisinferencemodelisthatthegenerateddescriptionsincludeadequatesetofinterpretablefeaturesthatarederivedfromtheinclusionofunknownobservationswithintheconceptualspace.following,anumberofissuesrelatedtothesemanticinferenceapproacharediscussed.

Lexicalisation:Ontheinferenceprocess,onepointrelatedtothelexicalisa-tiontaskistodeterminethemostdescriptivetermstouseinordertolinguis-ticallyrepresentanewobservation.AsseenintheLeafexample,themostob-viouspsychologicaldescriptioncouldberegardingaleaf’ssimilarityordissim-ilaritytotheknownconcepts.Thus,eitherthelinguisticlabelsoftheknownconceptsorthelinguistictermsofqualitydimensionswhicharestoredinsym-bolspacecanbeused.So,thesemanticinterpretabilityofsuchlabelswillaffectdirectlyonthedescriptivequalityofthenaturaloutputtext.

Conceptsasnounsoradjectives:Onepointrelatedtothelexicalisationandrealisationisthatfromthenaturallanguagepointofview,theconceptsinaconceptualspacetypicallyrepresentthenounswhilethesub-conceptsorprop-ertiesarerelatedtotheadjectives.Theapproachtodeterminelinguisticde-scriptionsdoesnotmakethedistinctionbetweenadjectivesandnouns,andconsequently,betweenthepropertiesandconceptsfortheclasslabelsfromtheinputlearningdataset6.Preferably,oneregionwithinadomainisconsideredasaconceptorsub-conceptthatcanbeeithernounsoradjectives,andbeusedto

6Thereasonisthatfrommachinelearningpointofview,thereisnolinguisticdistinctionbe-tweenthedifferenttypesoftheclasses,inasensethatthelearningproblemistoclassifywhetherthenounconcepts,adjectiveconcepts,orevenverbconcepts.Forexample,inIrisdatasettheclassesarenouns,aseachclassreferstoanameofirisplant[146],however,inWinequalitydatasettheclassesareadjectives,aseachclassreferstothequalitylevelofwinesfromexcellenttopoorquality[64].


NLG systems provides a well-defined realisation for the abstracted represen-tations, which the details of the architecture can be found in [185] and theimplementation details are available in [95].

Example 4.14. Consider the set of annotations and characterisations from Ex-amples 4.11 and 4.13. Then, the output of realisation will be a message like:“This unknown leaf observation is like Nerium leaves and somewhat Tilialeaves, but it is an elongated and a lance-shaped leaf.”

4.3 Discussion

This chapter has presented the utility of extending conceptual spaces as seman-tic inference models to generate linguistic descriptions for unknown observa-tions in natural language. It has been shown that linking a symbol space to theconceptual space facilitates the process of inferring linguistic characterisationsfor the unknown observations. One advantage of this inference model is thatthe generated descriptions include adequate set of interpretable features thatare derived from the inclusion of unknown observations within the conceptualspace. following, a number of issues related to the semantic inference approachare discussed.

Lexicalisation: On the inference process, one point related to the lexicalisa-tion task is to determine the most descriptive terms to use in order to linguis-tically represent a new observation. As seen in the Leaf example, the most ob-vious psychological description could be regarding a leaf’s similarity or dissim-ilarity to the known concepts. Thus, either the linguistic labels of the knownconcepts or the linguistic terms of quality dimensions which are stored in sym-bol space can be used. So, the semantic interpretability of such labels will affectdirectly on the descriptive quality of the natural output text.

Concepts as nouns or adjectives: One point related to the lexicalisation andrealisation is that from the natural language point of view, the concepts in aconceptual space typically represent the nouns while the sub-concepts or prop-erties are related to the adjectives. The approach to determine linguistic de-scriptions does not make the distinction between adjectives and nouns, andconsequently, between the properties and concepts for the class labels from theinput learning data set6. Preferably, one region within a domain is considered asa concept or sub-concept that can be either nouns or adjectives, and be used to

6The reason is that from machine learning point of view, there is no linguistic distinction be-tween the different types of the classes, in a sense that the learning problem is to classify whetherthe noun concepts, adjective concepts, or even verb concepts. For example, in Iris data set theclasses are nouns, as each class refers to a name of iris plant [146], however, in Wine quality dataset the classes are adjectives, as each class refers to the quality level of wines from excellent to poorquality [64].


NLGsystemsprovidesawell-definedrealisationfortheabstractedrepresen-tations,whichthedetailsofthearchitecturecanbefoundin[185]andtheimplementationdetailsareavailablein[95].

Example4.14.ConsiderthesetofannotationsandcharacterisationsfromEx-amples4.11and4.13.Then,theoutputofrealisationwillbeamessagelike:“ThisunknownleafobservationislikeNeriumleavesandsomewhatTilialeaves,butitisanelongatedandalance-shapedleaf.”

4.3Discussion

Thischapterhaspresentedtheutilityofextendingconceptualspacesasseman-ticinferencemodelstogeneratelinguisticdescriptionsforunknownobserva-tionsinnaturallanguage.Ithasbeenshownthatlinkingasymbolspacetotheconceptualspacefacilitatestheprocessofinferringlinguisticcharacterisationsfortheunknownobservations.Oneadvantageofthisinferencemodelisthatthegenerateddescriptionsincludeadequatesetofinterpretablefeaturesthatarederivedfromtheinclusionofunknownobservationswithintheconceptualspace.following,anumberofissuesrelatedtothesemanticinferenceapproacharediscussed.

Lexicalisation:Ontheinferenceprocess,onepointrelatedtothelexicalisa-tiontaskistodeterminethemostdescriptivetermstouseinordertolinguis-ticallyrepresentanewobservation.AsseenintheLeafexample,themostob-viouspsychologicaldescriptioncouldberegardingaleaf’ssimilarityordissim-ilaritytotheknownconcepts.Thus,eitherthelinguisticlabelsoftheknownconceptsorthelinguistictermsofqualitydimensionswhicharestoredinsym-bolspacecanbeused.So,thesemanticinterpretabilityofsuchlabelswillaffectdirectlyonthedescriptivequalityofthenaturaloutputtext.

Conceptsasnounsoradjectives:Onepointrelatedtothelexicalisationandrealisationisthatfromthenaturallanguagepointofview,theconceptsinaconceptualspacetypicallyrepresentthenounswhilethesub-conceptsorprop-ertiesarerelatedtotheadjectives.Theapproachtodeterminelinguisticde-scriptionsdoesnotmakethedistinctionbetweenadjectivesandnouns,andconsequently,betweenthepropertiesandconceptsfortheclasslabelsfromtheinputlearningdataset6.Preferably,oneregionwithinadomainisconsideredasaconceptorsub-conceptthatcanbeeithernounsoradjectives,andbeusedto

6Thereasonisthatfrommachinelearningpointofview,thereisnolinguisticdistinctionbe-tweenthedifferenttypesoftheclasses,inasensethatthelearningproblemistoclassifywhetherthenounconcepts,adjectiveconcepts,orevenverbconcepts.Forexample,inIrisdatasettheclassesarenouns,aseachclassreferstoanameofirisplant[146],however,inWinequalitydatasettheclassesareadjectives,aseachclassreferstothequalitylevelofwinesfromexcellenttopoorquality[64].

4.3. DISCUSSION 71

describe an observation. For example, descriptions of an object in a conceptualspace of fruits can be either “the object is an apple” or “the object is red”. Inthe former, the label refers to the concept apple as a noun, and in the latter one,the label refers to the sub-concept red as an adjective. It is assumed that all thelinguistic terms of the quality dimensions are considered as the adjectives. Forinstance, if weight is involved in the description of an object, the term heavy isconsidered as the adjective in such descriptions like “the object is heavy”.

In sum, this chapter together with chapter 3 has shown the process of creat-ing data-driven conceptual spaces and inferring linguistic description from suchspaces. The presented approaches in these two chapters will be exemplified inchapter 5 to show the goodness of the developed semantic representation forreal-world data samples.

4.3.DISCUSSION71

describeanobservation.Forexample,descriptionsofanobjectinaconceptualspaceoffruitscanbeeither“theobjectisanapple”or“theobjectisred”.Intheformer,thelabelreferstotheconceptappleasanoun,andinthelatterone,thelabelreferstothesub-conceptredasanadjective.Itisassumedthatallthelinguistictermsofthequalitydimensionsareconsideredastheadjectives.Forinstance,ifweightisinvolvedinthedescriptionofanobject,thetermheavyisconsideredastheadjectiveinsuchdescriptionslike“theobjectisheavy”.

Insum,thischaptertogetherwithchapter3hasshowntheprocessofcreat-ingdata-drivenconceptualspacesandinferringlinguisticdescriptionfromsuchspaces.Thepresentedapproachesinthesetwochapterswillbeexemplifiedinchapter5toshowthegoodnessofthedevelopedsemanticrepresentationforreal-worlddatasamples.

4.3. DISCUSSION 71

describe an observation. For example, descriptions of an object in a conceptualspace of fruits can be either “the object is an apple” or “the object is red”. Inthe former, the label refers to the concept apple as a noun, and in the latter one,the label refers to the sub-concept red as an adjective. It is assumed that all thelinguistic terms of the quality dimensions are considered as the adjectives. Forinstance, if weight is involved in the description of an object, the term heavy isconsidered as the adjective in such descriptions like “the object is heavy”.

In sum, this chapter together with chapter 3 has shown the process of creat-ing data-driven conceptual spaces and inferring linguistic description from suchspaces. The presented approaches in these two chapters will be exemplified inchapter 5 to show the goodness of the developed semantic representation forreal-world data samples.

4.3.DISCUSSION71

describeanobservation.Forexample,descriptionsofanobjectinaconceptualspaceoffruitscanbeeither“theobjectisanapple”or“theobjectisred”.Intheformer,thelabelreferstotheconceptappleasanoun,andinthelatterone,thelabelreferstothesub-conceptredasanadjective.Itisassumedthatallthelinguistictermsofthequalitydimensionsareconsideredastheadjectives.Forinstance,ifweightisinvolvedinthedescriptionofanobject,thetermheavyisconsideredastheadjectiveinsuchdescriptionslike“theobjectisheavy”.

Insum,thischaptertogetherwithchapter3hasshowntheprocessofcreat-ingdata-drivenconceptualspacesandinferringlinguisticdescriptionfromsuchspaces.Thepresentedapproachesinthesetwochapterswillbeexemplifiedinchapter5toshowthegoodnessofthedevelopedsemanticrepresentationforreal-worlddatasamples.

Chapter 5

Results and Evaluation: A Case

Study on Leaf Data Set

“It doesn’t matter how beautiful your theory is, it doesn’tmatter how smart you are. If it doesn’t agree withexperiment, it’s wrong.”

— Richard Feynman (1918–1988)

This chapter presents an assessment of the formal methods described in thechapters 3 and 4. A case study from a data set of leaves is investigated

to show the feasibility and the plausibility of the proposed approach in real-world applications. This chapter also describes an empirical evaluation that isdesigned to assess the goodness of the developed methods for describing un-known observations using the conceptual spaces. Finally, the results of thisempirical assessment for the leaf data set are presented.

The leaf data set [206] is a set of photographed leaf samples from differ-ent plant species. Here, six species are selected as the labelled data set (SeeFigure 5.1), and the rest of the species are used as unlabelled data. The leafdata set is a good first example, as it provides a tangible example of physicalobjects while the vocabulary used to describe the leaves is not necessarily famil-iar to non-specialists. For the case study on the leaf data set, the primitive setof features is initialised by expert-oriented questionnaires or domain-orientedbackground knowledge. These semantically interpretable features are describ-able in natural language and are able to distinguish the known classes fromeach other perceptually.

73

Chapter5

ResultsandEvaluation:ACase

StudyonLeafDataSet

“Itdoesn’tmatterhowbeautifulyourtheoryis,itdoesn’tmatterhowsmartyouare.Ifitdoesn’tagreewithexperiment,it’swrong.”

—RichardFeynman(1918–1988)

Thischapterpresentsanassessmentoftheformalmethodsdescribedinthechapters3and4.Acasestudyfromadatasetofleavesisinvestigated

toshowthefeasibilityandtheplausibilityoftheproposedapproachinreal-worldapplications.Thischapteralsodescribesanempiricalevaluationthatisdesignedtoassessthegoodnessofthedevelopedmethodsfordescribingun-knownobservationsusingtheconceptualspaces.Finally,theresultsofthisempiricalassessmentfortheleafdatasetarepresented.

Theleafdataset[206]isasetofphotographedleafsamplesfromdiffer-entplantspecies.Here,sixspeciesareselectedasthelabelleddataset(SeeFigure5.1),andtherestofthespeciesareusedasunlabelleddata.Theleafdatasetisagoodfirstexample,asitprovidesatangibleexampleofphysicalobjectswhilethevocabularyusedtodescribetheleavesisnotnecessarilyfamil-iartonon-specialists.Forthecasestudyontheleafdataset,theprimitivesetoffeaturesisinitialisedbyexpert-orientedquestionnairesordomain-orientedbackgroundknowledge.Thesesemanticallyinterpretablefeaturesaredescrib-ableinnaturallanguageandareabletodistinguishtheknownclassesfromeachotherperceptually.

73

Chapter 5

Results and Evaluation: A Case

Study on Leaf Data Set

“It doesn’t matter how beautiful your theory is, it doesn’tmatter how smart you are. If it doesn’t agree withexperiment, it’s wrong.”

— Richard Feynman (1918–1988)

This chapter presents an assessment of the formal methods described in thechapters 3 and 4. A case study from a data set of leaves is investigated

to show the feasibility and the plausibility of the proposed approach in real-world applications. This chapter also describes an empirical evaluation that isdesigned to assess the goodness of the developed methods for describing un-known observations using the conceptual spaces. Finally, the results of thisempirical assessment for the leaf data set are presented.

The leaf data set [206] is a set of photographed leaf samples from differ-ent plant species. Here, six species are selected as the labelled data set (SeeFigure 5.1), and the rest of the species are used as unlabelled data. The leafdata set is a good first example, as it provides a tangible example of physicalobjects while the vocabulary used to describe the leaves is not necessarily famil-iar to non-specialists. For the case study on the leaf data set, the primitive setof features is initialised by expert-oriented questionnaires or domain-orientedbackground knowledge. These semantically interpretable features are describ-able in natural language and are able to distinguish the known classes fromeach other perceptually.

73

Chapter5

ResultsandEvaluation:ACase

StudyonLeafDataSet

“Itdoesn’tmatterhowbeautifulyourtheoryis,itdoesn’tmatterhowsmartyouare.Ifitdoesn’tagreewithexperiment,it’swrong.”

—RichardFeynman(1918–1988)

Thischapterpresentsanassessmentoftheformalmethodsdescribedinthechapters3and4.Acasestudyfromadatasetofleavesisinvestigated

toshowthefeasibilityandtheplausibilityoftheproposedapproachinreal-worldapplications.Thischapteralsodescribesanempiricalevaluationthatisdesignedtoassessthegoodnessofthedevelopedmethodsfordescribingun-knownobservationsusingtheconceptualspaces.Finally,theresultsofthisempiricalassessmentfortheleafdatasetarepresented.

Theleafdataset[206]isasetofphotographedleafsamplesfromdiffer-entplantspecies.Here,sixspeciesareselectedasthelabelleddataset(SeeFigure5.1),andtherestofthespeciesareusedasunlabelleddata.Theleafdatasetisagoodfirstexample,asitprovidesatangibleexampleofphysicalobjectswhilethevocabularyusedtodescribetheleavesisnotnecessarilyfamil-iartonon-specialists.Forthecasestudyontheleafdataset,theprimitivesetoffeaturesisinitialisedbyexpert-orientedquestionnairesordomain-orientedbackgroundknowledge.Thesesemanticallyinterpretablefeaturesaredescrib-ableinnaturallanguageandareabletodistinguishtheknownclassesfromeachotherperceptually.

73

74 CHAPTER 5. RESULTS: LEAF CASE STUDY

Figure 5.1: Six species as the known leaves in the leaf data set. The first row oflabels shows the scientific names [206], and the second row of labels shows thecommon names [231].

5.1 Constructing a Conceptual Space of Leaves

The leaf data set includes 72 known leaf samples that are categorised in sixspecies. Formally, Dl = {o1, . . . ,o72} and Yl = { ysa : ‘Salix Atrocinera’,yqr : ‘Quercus Robur’, yia : ‘Ilex Aquifolium’, yno : ‘Nerium Oleander’, ytt :‘Tilia Tomentosa’, yap : ‘Acer Palmatum’ }. Figure 5.1 shows the prototypi-cal samples of leaf species for the leaf labels in Y, along with their popularnames.1 According to the set of class labels Yl, the set of concepts is definedas Cl = {Cia,Ctt,Cno,Cqr,Cap,Csa}. Here, the concepts are initiated, but therepresentation of each concept will be formally presented later.

The first step to build a conceptual space of leaves is to specify the initial setof features that characterises the leaves. The primary criterion while initialisingthe features is how descriptive or interpretable the chosen features are in thelinguistic form. In other words, this approach is looking for such features thatare representable in natural language with a perceptual interpretation. For ex-ample, the values of area and perimeter features might be useful for statisticalanalysis or classification tasks, but these features do not carry meaningful infor-mation to describe and distinguish the leaf observations. In contrast, a featurelike elongation meaningfully describes a perceptual feature of a leaf observa-tion.

Besides the leaf samples in different species, Silva et al. [206] have provideda set of attributes that describe the shape and texture features of leaves. Amongthe semantic attributes, the following features and used in the model as theinitial set of features Fl = {Xi : 〈HXi

, IXi〉}:

1Note that in the model, the scientific names of the leaves have been applied as labels used in theoriginal data set [206]. However, in the final descriptions for the evaluation, the common names ofleaves are used [231] which were more familiar to the general users.

74CHAPTER5.RESULTS:LEAFCASESTUDY

Figure5.1:Sixspeciesastheknownleavesintheleafdataset.Thefirstrowoflabelsshowsthescientificnames[206],andthesecondrowoflabelsshowsthecommonnames[231].

5.1ConstructingaConceptualSpaceofLeaves

Theleafdatasetincludes72knownleafsamplesthatarecategorisedinsixspecies.Formally,D

l={o1,...,o72}andY

l={ysa:‘SalixAtrocinera’,

yqr:‘QuercusRobur’,yia:‘IlexAquifolium’,yno:‘NeriumOleander’,ytt:‘TiliaTomentosa’,yap:‘AcerPalmatum’}.Figure5.1showstheprototypi-calsamplesofleafspeciesfortheleaflabelsinY,alongwiththeirpopularnames.1AccordingtothesetofclasslabelsY

l,thesetofconceptsisdefined

asCl={Cia,Ctt,Cno,Cqr,Cap,Csa}.Here,theconceptsareinitiated,butthe

representationofeachconceptwillbeformallypresentedlater.Thefirststeptobuildaconceptualspaceofleavesistospecifytheinitialset

offeaturesthatcharacterisestheleaves.Theprimarycriterionwhileinitialisingthefeaturesishowdescriptiveorinterpretablethechosenfeaturesareinthelinguisticform.Inotherwords,thisapproachislookingforsuchfeaturesthatarerepresentableinnaturallanguagewithaperceptualinterpretation.Forex-ample,thevaluesofareaandperimeterfeaturesmightbeusefulforstatisticalanalysisorclassificationtasks,butthesefeaturesdonotcarrymeaningfulinfor-mationtodescribeanddistinguishtheleafobservations.Incontrast,afeaturelikeelongationmeaningfullydescribesaperceptualfeatureofaleafobserva-tion.

Besidestheleafsamplesindifferentspecies,Silvaetal.[206]haveprovidedasetofattributesthatdescribetheshapeandtexturefeaturesofleaves.Amongthesemanticattributes,thefollowingfeaturesandusedinthemodelastheinitialsetoffeaturesF

l={Xi:〈HXi,IXi〉}:

1Notethatinthemodel,thescientificnamesoftheleaveshavebeenappliedaslabelsusedintheoriginaldataset[206].However,inthefinaldescriptionsfortheevaluation,thecommonnamesofleavesareused[231]whichweremorefamiliartothegeneralusers.


Figure 5.1: Six species as the known leaves in the leaf data set. The first row oflabels shows the scientific names [206], and the second row of labels shows thecommon names [231].

5.1 Constructing a Conceptual Space of Leaves

The leaf data set includes 72 known leaf samples that are categorised in sixspecies. Formally, Dl = {o1, . . . ,o72} and Yl = { ysa : ‘Salix Atrocinera’,yqr : ‘Quercus Robur’, yia : ‘Ilex Aquifolium’, yno : ‘Nerium Oleander’, ytt :‘Tilia Tomentosa’, yap : ‘Acer Palmatum’ }. Figure 5.1 shows the prototypi-cal samples of leaf species for the leaf labels in Y, along with their popularnames.1 According to the set of class labels Yl, the set of concepts is definedas Cl = {Cia,Ctt,Cno,Cqr,Cap,Csa}. Here, the concepts are initiated, but therepresentation of each concept will be formally presented later.

The first step to build a conceptual space of leaves is to specify the initial setof features that characterises the leaves. The primary criterion while initialisingthe features is how descriptive or interpretable the chosen features are in thelinguistic form. In other words, this approach is looking for such features thatare representable in natural language with a perceptual interpretation. For ex-ample, the values of area and perimeter features might be useful for statisticalanalysis or classification tasks, but these features do not carry meaningful infor-mation to describe and distinguish the leaf observations. In contrast, a featurelike elongation meaningfully describes a perceptual feature of a leaf observa-tion.

Besides the leaf samples in different species, Silva et al. [206] have provideda set of attributes that describe the shape and texture features of leaves. Amongthe semantic attributes, the following features and used in the model as theinitial set of features Fl = {Xi : 〈HXi

, IXi〉}:

1Note that in the model, the scientific names of the leaves have been applied as labels used in theoriginal data set [206]. However, in the final descriptions for the evaluation, the common names ofleaves are used [231] which were more familiar to the general users.


Figure5.1:Sixspeciesastheknownleavesintheleafdataset.Thefirstrowoflabelsshowsthescientificnames[206],andthesecondrowoflabelsshowsthecommonnames[231].

5.1ConstructingaConceptualSpaceofLeaves

Theleafdatasetincludes72knownleafsamplesthatarecategorisedinsixspecies.Formally,D

l={o1,...,o72}andY

l={ysa:‘SalixAtrocinera’,

yqr:‘QuercusRobur’,yia:‘IlexAquifolium’,yno:‘NeriumOleander’,ytt:‘TiliaTomentosa’,yap:‘AcerPalmatum’}.Figure5.1showstheprototypi-calsamplesofleafspeciesfortheleaflabelsinY,alongwiththeirpopularnames.1AccordingtothesetofclasslabelsY

l,thesetofconceptsisdefined

asCl={Cia,Ctt,Cno,Cqr,Cap,Csa}.Here,theconceptsareinitiated,butthe

representationofeachconceptwillbeformallypresentedlater.Thefirststeptobuildaconceptualspaceofleavesistospecifytheinitialset

offeaturesthatcharacterisestheleaves.Theprimarycriterionwhileinitialisingthefeaturesishowdescriptiveorinterpretablethechosenfeaturesareinthelinguisticform.Inotherwords,thisapproachislookingforsuchfeaturesthatarerepresentableinnaturallanguagewithaperceptualinterpretation.Forex-ample,thevaluesofareaandperimeterfeaturesmightbeusefulforstatisticalanalysisorclassificationtasks,butthesefeaturesdonotcarrymeaningfulinfor-mationtodescribeanddistinguishtheleafobservations.Incontrast,afeaturelikeelongationmeaningfullydescribesaperceptualfeatureofaleafobserva-tion.

Besidestheleafsamplesindifferentspecies,Silvaetal.[206]haveprovidedasetofattributesthatdescribetheshapeandtexturefeaturesofleaves.Amongthesemanticattributes,thefollowingfeaturesandusedinthemodelastheinitialsetoffeaturesF

l={Xi:〈HXi,IXi〉}:

1Notethatinthemodel,thescientificnamesoftheleaveshavebeenappliedaslabelsusedintheoriginaldataset[206].However,inthefinaldescriptionsfortheevaluation,thecommonnamesofleavesareused[231]whichweremorefamiliartothegeneralusers.

5.1. CONSTRUCTING A CONCEPTUAL SPACE OF LEAVES 75

ysa yqr yia yno ytt yap

Xar Xec Xel Xif Xlo Xmi Xso Xsc

Figure 5.2: The bipartite graph presenting the relevance of features and labelsin leaf data set. Also, three chosen bicliques (as the domains) are highlightedwith blue, red, and grey edges.

Xec : 〈‘Eccentricity’, [0, 1]〉 (eccentricity of the ellipse),Xar : 〈‘Aspect Ratio’, [1, inf)〉 (values close to 1 indicate an elongated shape),Xel : 〈‘Elongation’, [0, 1]〉 (minimum is achieved for a circular region),Xso : 〈‘Solidity’, (0, 1)〉, (how well the leaf fits a convex shape),Xif : 〈‘Isoperimetric Factor’, [1, inf)〉 (curvy intertwined contours yield lowvalues),Xlo : 〈‘Lobedness’, (0, inf)〉 (characterises how lobed a leaf is),Xmi : 〈‘Maximal Indentation Depth’, (0, 1)〉, (how deep are the indentations),Xsc : 〈‘Stochastic Convexity’, [0, 1]〉 , (probability of a random segment in aleaf to be fully contained).

5.1.1 Domain Specification for Leaf Data set

The values of these features for every observation are acquired from [206]. Af-ter that, the construction of the conceptual space is performed with the inputsof the labelled observations Dl, label set Yl, and feature set Fl. The algorithmfirst applies the feature filtering approach, i.e. MIFS (Algorithm 3.1) to pro-vide a ranking matrix which shows the mutual relations of features and labels.Then, using Algorithm 3.2, a feature subset grouping is performed. Figure 5.2illustrates the created bipartite graph, which leads to determining the domainsand quality dimensions.

The chosen bicliques (with the highest scores) determine the three domainsΔl = {δ1, δ2, δ3}, where each domain is specified as follows:

• Domain δ1 = 〈Q(δ1),C(δ1),ωδ1〉, whereinQ(δ1) = {qar, qel, qec},C(δ1) = {Cia,Ctt,Cno}.

• Domain δ2 = 〈Q(δ2),C(δ2),ωδ2〉, whereinQ(δ2) = {qso, qif},C(δ2) = {Cqr,Cap}.

5.1.CONSTRUCTINGACONCEPTUALSPACEOFLEAVES75

ysayqryiaynoyttyap

XarXecXelXifXloXmiXsoXsc

Figure5.2:Thebipartitegraphpresentingtherelevanceoffeaturesandlabelsinleafdataset.Also,threechosenbicliques(asthedomains)arehighlightedwithblue,red,andgreyedges.

Xec:〈‘Eccentricity’,[0,1]〉(eccentricityoftheellipse),Xar:〈‘AspectRatio’,[1,inf)〉(valuescloseto1indicateanelongatedshape),Xel:〈‘Elongation’,[0,1]〉(minimumisachievedforacircularregion),Xso:〈‘Solidity’,(0,1)〉,(howwelltheleaffitsaconvexshape),Xif:〈‘IsoperimetricFactor’,[1,inf)〉(curvyintertwinedcontoursyieldlowvalues),Xlo:〈‘Lobedness’,(0,inf)〉(characteriseshowlobedaleafis),Xmi:〈‘MaximalIndentationDepth’,(0,1)〉,(howdeeparetheindentations),Xsc:〈‘StochasticConvexity’,[0,1]〉,(probabilityofarandomsegmentinaleaftobefullycontained).

5.1.1DomainSpecificationforLeafDataset

Thevaluesofthesefeaturesforeveryobservationareacquiredfrom[206].Af-terthat,theconstructionoftheconceptualspaceisperformedwiththeinputsofthelabelledobservationsD

l,labelsetY

l,andfeaturesetF

l.Thealgorithm

firstappliesthefeaturefilteringapproach,i.e.MIFS(Algorithm3.1)topro-videarankingmatrixwhichshowsthemutualrelationsoffeaturesandlabels.Then,usingAlgorithm3.2,afeaturesubsetgroupingisperformed.Figure5.2illustratesthecreatedbipartitegraph,whichleadstodeterminingthedomainsandqualitydimensions.

Thechosenbicliques(withthehighestscores)determinethethreedomainsΔ

l={δ1,δ2,δ3},whereeachdomainisspecifiedasfollows:

•Domainδ1=〈Q(δ1),C(δ1),ωδ1〉,whereinQ(δ1)={qar,qel,qec},C(δ1)={Cia,Ctt,Cno}.

•Domainδ2=〈Q(δ2),C(δ2),ωδ2〉,whereinQ(δ2)={qso,qif},C(δ2)={Cqr,Cap}.

5.1. CONSTRUCTING A CONCEPTUAL SPACE OF LEAVES 75

ysa yqr yia yno ytt yap

Xar Xec Xel Xif Xlo Xmi Xso Xsc

Figure 5.2: The bipartite graph presenting the relevance of features and labelsin leaf data set. Also, three chosen bicliques (as the domains) are highlightedwith blue, red, and grey edges.

Xec : 〈‘Eccentricity’, [0, 1]〉 (eccentricity of the ellipse),Xar : 〈‘Aspect Ratio’, [1, inf)〉 (values close to 1 indicate an elongated shape),Xel : 〈‘Elongation’, [0, 1]〉 (minimum is achieved for a circular region),Xso : 〈‘Solidity’, (0, 1)〉, (how well the leaf fits a convex shape),Xif : 〈‘Isoperimetric Factor’, [1, inf)〉 (curvy intertwined contours yield lowvalues),Xlo : 〈‘Lobedness’, (0, inf)〉 (characterises how lobed a leaf is),Xmi : 〈‘Maximal Indentation Depth’, (0, 1)〉, (how deep are the indentations),Xsc : 〈‘Stochastic Convexity’, [0, 1]〉 , (probability of a random segment in aleaf to be fully contained).

5.1.1 Domain Specification for Leaf Data set

The values of these features for every observation are acquired from [206]. Af-ter that, the construction of the conceptual space is performed with the inputsof the labelled observations Dl, label set Yl, and feature set Fl. The algorithmfirst applies the feature filtering approach, i.e. MIFS (Algorithm 3.1) to pro-vide a ranking matrix which shows the mutual relations of features and labels.Then, using Algorithm 3.2, a feature subset grouping is performed. Figure 5.2illustrates the created bipartite graph, which leads to determining the domainsand quality dimensions.

The chosen bicliques (with the highest scores) determine the three domainsΔl = {δ1, δ2, δ3}, where each domain is specified as follows:

• Domain δ1 = 〈Q(δ1),C(δ1),ωδ1〉, whereinQ(δ1) = {qar, qel, qec},C(δ1) = {Cia,Ctt,Cno}.

• Domain δ2 = 〈Q(δ2),C(δ2),ωδ2〉, whereinQ(δ2) = {qso, qif},C(δ2) = {Cqr,Cap}.

5.1.CONSTRUCTINGACONCEPTUALSPACEOFLEAVES75

ysayqryiaynoyttyap

XarXecXelXifXloXmiXsoXsc

Figure5.2:Thebipartitegraphpresentingtherelevanceoffeaturesandlabelsinleafdataset.Also,threechosenbicliques(asthedomains)arehighlightedwithblue,red,andgreyedges.

Xec:〈‘Eccentricity’,[0,1]〉(eccentricityoftheellipse),Xar:〈‘AspectRatio’,[1,inf)〉(valuescloseto1indicateanelongatedshape),Xel:〈‘Elongation’,[0,1]〉(minimumisachievedforacircularregion),Xso:〈‘Solidity’,(0,1)〉,(howwelltheleaffitsaconvexshape),Xif:〈‘IsoperimetricFactor’,[1,inf)〉(curvyintertwinedcontoursyieldlowvalues),Xlo:〈‘Lobedness’,(0,inf)〉(characteriseshowlobedaleafis),Xmi:〈‘MaximalIndentationDepth’,(0,1)〉,(howdeeparetheindentations),Xsc:〈‘StochasticConvexity’,[0,1]〉,(probabilityofarandomsegmentinaleaftobefullycontained).

5.1.1DomainSpecificationforLeafDataset

Thevaluesofthesefeaturesforeveryobservationareacquiredfrom[206].Af-terthat,theconstructionoftheconceptualspaceisperformedwiththeinputsofthelabelledobservationsD

l,labelsetY

l,andfeaturesetF

l.Thealgorithm

firstappliesthefeaturefilteringapproach,i.e.MIFS(Algorithm3.1)topro-videarankingmatrixwhichshowsthemutualrelationsoffeaturesandlabels.Then,usingAlgorithm3.2,afeaturesubsetgroupingisperformed.Figure5.2illustratesthecreatedbipartitegraph,whichleadstodeterminingthedomainsandqualitydimensions.

Thechosenbicliques(withthehighestscores)determinethethreedomainsΔ

l={δ1,δ2,δ3},whereeachdomainisspecifiedasfollows:

•Domainδ1=〈Q(δ1),C(δ1),ωδ1〉,whereinQ(δ1)={qar,qel,qec},C(δ1)={Cia,Ctt,Cno}.

•Domainδ2=〈Q(δ2),C(δ2),ωδ2〉,whereinQ(δ2)={qso,qif},C(δ2)={Cqr,Cap}.


Figure 5.3: The conceptual space of leaf data set: a graphical presentation of thedetermined domains with the corresponding quality dimensions and concepts.

• Domain δ3 = 〈Q(δ3),C(δ3),ωδ3〉, whereinQ(δ3) = {qlo},C(δ3) = {Csa,Cap}.

Figure 5.3 depicts a graphical presentation of the determined domainswith the corresponding quality dimensions and concepts. As an exam-ple, domain δ2 is specified by two quality dimensions ‘solidity’ and‘isoperimetric factor’, and is associated with two concepts ‘Quercus’ and‘Acer’. An example of calculated weights in a domain is ωδ2(Cap,qso) = 0.61,2,which shows the salience of the relation between leaf concept ‘Acer’ and qualitydimension ‘solidity’ within δ2. Although the process of deriving the domainsis data-driven, there may be an interpretation of each specified domain. For in-stance, one can say that δ1 illustrates the convexity of the known leaves, whileδ2 shows the indentation of the known leaves (see Figure 5.3).

As an output of the domain specification phase for the conceptual spaceof leaves, the set of quality dimensions is Ql = {qar, qel, qec, qso, qif, qlo},and the set of instances is Γ l = ∪y∈Y Γ(y), where |Γ l| = |Dl|. Each γ ∈ Γ l

corresponds to a known leaf sample o ∈ Dl, and consists of three points (onein each domain).

2It is notable that the values of weights, calculating with filter method (MIFS), are not inter-pretable individually. However, they are involved in the model, since they are helpful to compareand measure the similarities between the concepts and instances.


Figure5.3:Theconceptualspaceofleafdataset:agraphicalpresentationofthedetermineddomainswiththecorrespondingqualitydimensionsandconcepts.

•Domainδ3=〈Q(δ3),C(δ3),ωδ3〉,whereinQ(δ3)={qlo},C(δ3)={Csa,Cap}.

Figure5.3depictsagraphicalpresentationofthedetermineddomainswiththecorrespondingqualitydimensionsandconcepts.Asanexam-ple,domainδ2isspecifiedbytwoqualitydimensions‘solidity’and‘isoperimetricfactor’,andisassociatedwithtwoconcepts‘Quercus’and‘Acer’.Anexampleofcalculatedweightsinadomainisωδ2(Cap,qso)=0.61,2,whichshowsthesalienceoftherelationbetweenleafconcept‘Acer’andqualitydimension‘solidity’withinδ2.Althoughtheprocessofderivingthedomainsisdata-driven,theremaybeaninterpretationofeachspecifieddomain.Forin-stance,onecansaythatδ1illustratestheconvexityoftheknownleaves,whileδ2showstheindentationoftheknownleaves(seeFigure5.3).

Asanoutputofthedomainspecificationphasefortheconceptualspaceofleaves,thesetofqualitydimensionsisQ

l={qar,qel,qec,qso,qif,qlo},

andthesetofinstancesisΓl=∪y∈YΓ(y),where|Γ

l|=|D

l|.Eachγ∈Γ

l

correspondstoaknownleafsampleo∈Dl,andconsistsofthreepoints(one

ineachdomain).

2Itisnotablethatthevaluesofweights,calculatingwithfiltermethod(MIFS),arenotinter-pretableindividually.However,theyareinvolvedinthemodel,sincetheyarehelpfultocompareandmeasurethesimilaritiesbetweentheconceptsandinstances.


Figure 5.3: The conceptual space of leaf data set: a graphical presentation of thedetermined domains with the corresponding quality dimensions and concepts.

• Domain δ3 = 〈Q(δ3),C(δ3),ωδ3〉, whereinQ(δ3) = {qlo},C(δ3) = {Csa,Cap}.

Figure 5.3 depicts a graphical presentation of the determined domainswith the corresponding quality dimensions and concepts. As an exam-ple, domain δ2 is specified by two quality dimensions ‘solidity’ and‘isoperimetric factor’, and is associated with two concepts ‘Quercus’ and‘Acer’. An example of calculated weights in a domain is ωδ2(Cap,qso) = 0.61,2,which shows the salience of the relation between leaf concept ‘Acer’ and qualitydimension ‘solidity’ within δ2. Although the process of deriving the domainsis data-driven, there may be an interpretation of each specified domain. For in-stance, one can say that δ1 illustrates the convexity of the known leaves, whileδ2 shows the indentation of the known leaves (see Figure 5.3).

As an output of the domain specification phase for the conceptual spaceof leaves, the set of quality dimensions is Ql = {qar, qel, qec, qso, qif, qlo},and the set of instances is Γ l = ∪y∈Y Γ(y), where |Γ l| = |Dl|. Each γ ∈ Γ l

corresponds to a known leaf sample o ∈ Dl, and consists of three points (onein each domain).

2It is notable that the values of weights, calculating with filter method (MIFS), are not inter-pretable individually. However, they are involved in the model, since they are helpful to compareand measure the similarities between the concepts and instances.


Figure5.3:Theconceptualspaceofleafdataset:agraphicalpresentationofthedetermineddomainswiththecorrespondingqualitydimensionsandconcepts.

•Domainδ3=〈Q(δ3),C(δ3),ωδ3〉,whereinQ(δ3)={qlo},C(δ3)={Csa,Cap}.

Figure5.3depictsagraphicalpresentationofthedetermineddomainswiththecorrespondingqualitydimensionsandconcepts.Asanexam-ple,domainδ2isspecifiedbytwoqualitydimensions‘solidity’and‘isoperimetricfactor’,andisassociatedwithtwoconcepts‘Quercus’and‘Acer’.Anexampleofcalculatedweightsinadomainisωδ2(Cap,qso)=0.61,2,whichshowsthesalienceoftherelationbetweenleafconcept‘Acer’andqualitydimension‘solidity’withinδ2.Althoughtheprocessofderivingthedomainsisdata-driven,theremaybeaninterpretationofeachspecifieddomain.Forin-stance,onecansaythatδ1illustratestheconvexityoftheknownleaves,whileδ2showstheindentationoftheknownleaves(seeFigure5.3).

Asanoutputofthedomainspecificationphasefortheconceptualspaceofleaves,thesetofqualitydimensionsisQ

l={qar,qel,qec,qso,qif,qlo},

andthesetofinstancesisΓl=∪y∈YΓ(y),where|Γ

l|=|D

l|.Eachγ∈Γ

l

correspondstoaknownleafsampleo∈Dl,andconsistsofthreepoints(one

ineachdomain).

2Itisnotablethatthevaluesofweights,calculatingwithfiltermethod(MIFS),arenotinter-pretableindividually.However,theyareinvolvedinthemodel,sincetheyarehelpfultocompareandmeasurethesimilaritiesbetweentheconceptsandinstances.

5.2. SEMANTIC INFERENCE FOR UNKNOWN LEAF SAMPLES 77

5.1.2 Concept Representation for Leaf Concepts

According to the output of concept representation, each leaf concept in Cl ap-pears in only one domain and thus has exactly one sub-concept, except con-cept Cap which has two sub-concepts in two different domains. Using Algo-rithm 3.3, the elements of the sub-concepts for each known leaf concept in C iscalculated as follows.

• Leaf concepts ‘Ilex’, ‘Tilia’, and ‘Nerium’ are represented in δ1 as, re-spectively:Cia = {c1

ia : 〈η1ia,φ

1ia〉},

Ctt = {c1tt : 〈η1

tt,φ1tt〉},

Cno = {c1no : 〈η1

no,φ1no〉}.

• Leaf concept ‘Acer’ is represented in two domains δ2 and δ3 as:Cap = {c2

ap : 〈η2ap,φ

2ap〉, c3

ap : 〈η3ap,φ

3ap〉}.

• Leaf concept ‘Quercus’ is represented in δ2 as:Cqr = {c2

qr : 〈η2qr,φ

2qr〉}.

• Leaf concept ‘Salix’ is represented in δ3 as:Csa = {c3

sa : 〈η1sa,φ

1sa〉}.

In these representations, for example, η2qr shows the 2D convex polygon

of leaf concept ‘Quercus’ within δ2 (see Figure 5.3). Also, as an examplefor the weights, φ2

qr = {ωδ2(Cqr,qso), ωδ2(Cqr,qif)} shows the saliencebetween leaf concept ‘Quercus’ and two quality dimensions ‘solidity’ and‘isoperimetric factor’ within δ2. In Figure 5.3, the graphical presentation ofleaf concepts is shown by illustrating the convex hulls of their correspondingsub-concepts.

Now, with the provided elements, the conceptual space of the leaf data setis presented as: Sleaf = 〈 Ql, Δl, Cl, Γ l 〉.

5.2 Semantic Inference for Unknown Leaf Samples

Inference step aims to derive a semantic description for unknown observa-tions using the developed conceptual space. The utility of the conceptual spaceof leaves is presented using a set of unknown leaf samples from the plantspecies. Figure 5.4 presents the selected set of unknown leaf samples to berepresented. Here, It is shown how to apply the inference approach on onethe samples (e.g., leaf (a) in Figure 5.4). According to the inference process,an unknown observation like (a) is firstly vectorised to an instance γa. Thena linguistic description for a is inferred in two phases: setting symbol vectorsby inferring in conceptual space and setting the lexical items by inferring insymbol space. Instance γa is a set of points within the domains Δl(γa) de-noted by γa = {p1

γa,p2

γa,p3

γa}, where the numeric values of each point are

5.2.SEMANTICINFERENCEFORUNKNOWNLEAFSAMPLES77

5.1.2ConceptRepresentationforLeafConcepts

Accordingtotheoutputofconceptrepresentation,eachleafconceptinCl

ap-pearsinonlyonedomainandthushasexactlyonesub-concept,exceptcon-ceptCapwhichhastwosub-conceptsintwodifferentdomains.UsingAlgo-rithm3.3,theelementsofthesub-conceptsforeachknownleafconceptinCiscalculatedasfollows.

•Leafconcepts‘Ilex’,‘Tilia’,and‘Nerium’arerepresentedinδ1as,re-spectively:Cia={c1

ia:〈η1ia,φ1

ia〉},Ctt={c1

tt:〈η1tt,φ1

tt〉},Cno={c1

no:〈η1no,φ1

no〉}.•Leafconcept‘Acer’isrepresentedintwodomainsδ2andδ3as:Cap={c2

ap:〈η2ap,φ2

ap〉,c3ap:〈η3

ap,φ3ap〉}.

•Leafconcept‘Quercus’isrepresentedinδ2as:Cqr={c2

qr:〈η2qr,φ2

qr〉}.•Leafconcept‘Salix’isrepresentedinδ3as:Csa={c3

sa:〈η1sa,φ1

sa〉}.Intheserepresentations,forexample,η2

qrshowsthe2Dconvexpolygonofleafconcept‘Quercus’withinδ2(seeFigure5.3).Also,asanexamplefortheweights,φ2

qr={ωδ2(Cqr,qso),ωδ2(Cqr,qif)}showsthesaliencebetweenleafconcept‘Quercus’andtwoqualitydimensions‘solidity’and‘isoperimetricfactor’withinδ2.InFigure5.3,thegraphicalpresentationofleafconceptsisshownbyillustratingtheconvexhullsoftheircorrespondingsub-concepts.

Now,withtheprovidedelements,theconceptualspaceoftheleafdatasetispresentedas:Sleaf=〈Q

l,Δ

l,C

l,Γ

l〉.

5.2SemanticInferenceforUnknownLeafSamples

Inferencestepaimstoderiveasemanticdescriptionforunknownobserva-tionsusingthedevelopedconceptualspace.Theutilityoftheconceptualspaceofleavesispresentedusingasetofunknownleafsamplesfromtheplantspecies.Figure5.4presentstheselectedsetofunknownleafsamplestoberepresented.Here,Itisshownhowtoapplytheinferenceapproachononethesamples(e.g.,leaf(a)inFigure5.4).Accordingtotheinferenceprocess,anunknownobservationlike(a)isfirstlyvectorisedtoaninstanceγa.Thenalinguisticdescriptionforaisinferredintwophases:settingsymbolvectorsbyinferringinconceptualspaceandsettingthelexicalitemsbyinferringinsymbolspace.InstanceγaisasetofpointswithinthedomainsΔ

l(γa)de-

notedbyγa={p1γa,p2

γa,p3γa},wherethenumericvaluesofeachpointare

5.2. SEMANTIC INFERENCE FOR UNKNOWN LEAF SAMPLES 77

5.1.2 Concept Representation for Leaf Concepts

According to the output of concept representation, each leaf concept in Cl ap-pears in only one domain and thus has exactly one sub-concept, except con-cept Cap which has two sub-concepts in two different domains. Using Algo-rithm 3.3, the elements of the sub-concepts for each known leaf concept in C iscalculated as follows.

• Leaf concepts ‘Ilex’, ‘Tilia’, and ‘Nerium’ are represented in δ1 as, re-spectively:Cia = {c1

ia : 〈η1ia,φ

1ia〉},

Ctt = {c1tt : 〈η1

tt,φ1tt〉},

Cno = {c1no : 〈η1

no,φ1no〉}.

• Leaf concept ‘Acer’ is represented in two domains δ2 and δ3 as:Cap = {c2

ap : 〈η2ap,φ

2ap〉, c3

ap : 〈η3ap,φ

3ap〉}.

• Leaf concept ‘Quercus’ is represented in δ2 as:Cqr = {c2

qr : 〈η2qr,φ

2qr〉}.

• Leaf concept ‘Salix’ is represented in δ3 as:Csa = {c3

sa : 〈η1sa,φ

1sa〉}.

In these representations, for example, η2qr shows the 2D convex polygon

of leaf concept ‘Quercus’ within δ2 (see Figure 5.3). Also, as an examplefor the weights, φ2

qr = {ωδ2(Cqr,qso), ωδ2(Cqr,qif)} shows the saliencebetween leaf concept ‘Quercus’ and two quality dimensions ‘solidity’ and‘isoperimetric factor’ within δ2. In Figure 5.3, the graphical presentation ofleaf concepts is shown by illustrating the convex hulls of their correspondingsub-concepts.

Now, with the provided elements, the conceptual space of the leaf data setis presented as: Sleaf = 〈 Ql, Δl, Cl, Γ l 〉.

5.2 Semantic Inference for Unknown Leaf Samples

Inference step aims to derive a semantic description for unknown observa-tions using the developed conceptual space. The utility of the conceptual spaceof leaves is presented using a set of unknown leaf samples from the plantspecies. Figure 5.4 presents the selected set of unknown leaf samples to berepresented. Here, It is shown how to apply the inference approach on onethe samples (e.g., leaf (a) in Figure 5.4). According to the inference process,an unknown observation like (a) is firstly vectorised to an instance γa. Thena linguistic description for a is inferred in two phases: setting symbol vectorsby inferring in conceptual space and setting the lexical items by inferring insymbol space. Instance γa is a set of points within the domains Δl(γa) de-noted by γa = {p1

γa,p2

γa,p3

γa}, where the numeric values of each point are

5.2.SEMANTICINFERENCEFORUNKNOWNLEAFSAMPLES77

5.1.2ConceptRepresentationforLeafConcepts

Accordingtotheoutputofconceptrepresentation,eachleafconceptinCl

ap-pearsinonlyonedomainandthushasexactlyonesub-concept,exceptcon-ceptCapwhichhastwosub-conceptsintwodifferentdomains.UsingAlgo-rithm3.3,theelementsofthesub-conceptsforeachknownleafconceptinCiscalculatedasfollows.

•Leafconcepts‘Ilex’,‘Tilia’,and‘Nerium’arerepresentedinδ1as,re-spectively:Cia={c1

ia:〈η1ia,φ1

ia〉},Ctt={c1

tt:〈η1tt,φ1

tt〉},Cno={c1

no:〈η1no,φ1

no〉}.•Leafconcept‘Acer’isrepresentedintwodomainsδ2andδ3as:Cap={c2

ap:〈η2ap,φ2

ap〉,c3ap:〈η3

ap,φ3ap〉}.

•Leafconcept‘Quercus’isrepresentedinδ2as:Cqr={c2

qr:〈η2qr,φ2

qr〉}.•Leafconcept‘Salix’isrepresentedinδ3as:Csa={c3

sa:〈η1sa,φ1

sa〉}.Intheserepresentations,forexample,η2

qrshowsthe2Dconvexpolygonofleafconcept‘Quercus’withinδ2(seeFigure5.3).Also,asanexamplefortheweights,φ2

qr={ωδ2(Cqr,qso),ωδ2(Cqr,qif)}showsthesaliencebetweenleafconcept‘Quercus’andtwoqualitydimensions‘solidity’and‘isoperimetricfactor’withinδ2.InFigure5.3,thegraphicalpresentationofleafconceptsisshownbyillustratingtheconvexhullsoftheircorrespondingsub-concepts.

Now,withtheprovidedelements,theconceptualspaceoftheleafdatasetispresentedas:Sleaf=〈Q

l,Δ

l,C

l,Γ

l〉.

5.2SemanticInferenceforUnknownLeafSamples

Inferencestepaimstoderiveasemanticdescriptionforunknownobserva-tionsusingthedevelopedconceptualspace.Theutilityoftheconceptualspaceofleavesispresentedusingasetofunknownleafsamplesfromtheplantspecies.Figure5.4presentstheselectedsetofunknownleafsamplestoberepresented.Here,Itisshownhowtoapplytheinferenceapproachononethesamples(e.g.,leaf(a)inFigure5.4).Accordingtotheinferenceprocess,anunknownobservationlike(a)isfirstlyvectorisedtoaninstanceγa.Thenalinguisticdescriptionforaisinferredintwophases:settingsymbolvectorsbyinferringinconceptualspaceandsettingthelexicalitemsbyinferringinsymbolspace.InstanceγaisasetofpointswithinthedomainsΔ

l(γa)de-

notedbyγa={p1γa,p2

γa,p3γa},wherethenumericvaluesofeachpointare


the feature values of (a) for each quality dimension in Sleaf . For example in δ2,p2γa

=〈qso(a), qif(a)〉=〈0.86, 0.45〉.

5.2.1 Inference in Conceptual Space of Leaves

Here, it is determined whether γa is included in any defined concept’s regions,and then infer the semantic labels based on the inclusion of the instance tothe regions. Considering the leaf sample (a) in Figure 5.4, γa belongs to thesub-concept c1

tt in δ1, however, it does not belong to any sub-concept in δ2

and δ3. Thus, based on Algorithm 4.1, the symbol vector for γa is set as fol-lows: In the concept layer, using the graded membership function (defined inDefinition 4.2): Vγa,C(Ctt) = (‘Tilia Tomentosa’, 0.85). In the quality layer,for the quality dimensions of δ2 and δ3, using the graded quality function (de-fined in Definition 4.3): Vγa,Q(q

2if) = (‘tipped/toothed’, 0.75), Vγa,Q(q

2so) =

(‘smooth edges/entire’, 0.86), and Vγa,Q(q3lo) = (‘low lobedness’, 0.6).

5.2.2 Inference in Symbol Space of Leaves

By retrieving the information of the symbol vector V(γa), it is possible toverbalise the elements of symbol vectors into a set of natural language de-scriptions. As mentioned in Section 4.2.2, γa is annotated using the valuesof Vγa,C, and characterised by the values of Vγa,Q. In particular, the anno-tation set is TC(γa) = {‘Tilia Tomentosa’}, and the characterisation set willbe TQ(γa) = { ‘tipped’, ‘smooth edges’, ‘low lobedness’ }. Then the real-isation for γa is as follows: Tγa

= ‘like Tilia Tomentosa(Silver Lime),also with smooth and tipped edges, and low lobedness’.

The same approach is applicable for other unknown leaf samples (shown inFigure 5.4) to describe them in natural language. Table 5.1 presents the gener-ated descriptions derived from the semantic inference.

5.3 Empirical Evaluation: Design and Procedure for

Leaf Samples

Finding a direct solution for assessing the applicability of the proposed con-ceptual space representation is not a trivial problem. Instead, the usefulness ofthe constructed conceptual spaces for the case study of leaves is evaluated viathe linguistic descriptions derived from such spaces. The experiment evaluatesthe following aims: (1) to measure the feasibility of deriving accurate descrip-tions to distinguish unknown observations, and (2) to assess the goodness ofthe descriptions derived from conceptual spaces in comparison to the descrip-tions derived from other base-line models. To these ends, a survey has beenconducted in which the participants were asked to


thefeaturevaluesof(a)foreachqualitydimensioninSleaf.Forexampleinδ2,p2γa=〈qso(a),qif(a)〉=〈0.86,0.45〉.

5.2.1InferenceinConceptualSpaceofLeaves

Here,itisdeterminedwhetherγaisincludedinanydefinedconcept’sregions,andtheninferthesemanticlabelsbasedontheinclusionoftheinstancetotheregions.Consideringtheleafsample(a)inFigure5.4,γabelongstothesub-conceptc1

ttinδ1,however,itdoesnotbelongtoanysub-conceptinδ2

andδ3.Thus,basedonAlgorithm4.1,thesymbolvectorforγaissetasfol-lows:Intheconceptlayer,usingthegradedmembershipfunction(definedinDefinition4.2):Vγa,C(Ctt)=(‘TiliaTomentosa’,0.85).Inthequalitylayer,forthequalitydimensionsofδ2andδ3,usingthegradedqualityfunction(de-finedinDefinition4.3):Vγa,Q(q2

if)=(‘tipped/toothed’,0.75),Vγa,Q(q2so)=

(‘smoothedges/entire’,0.86),andVγa,Q(q3lo)=(‘lowlobedness’,0.6).

5.2.2InferenceinSymbolSpaceofLeaves

ByretrievingtheinformationofthesymbolvectorV(γa),itispossibletoverbalisetheelementsofsymbolvectorsintoasetofnaturallanguagede-scriptions.AsmentionedinSection4.2.2,γaisannotatedusingthevaluesofVγa,C,andcharacterisedbythevaluesofVγa,Q.Inparticular,theanno-tationsetisTC(γa)={‘TiliaTomentosa’},andthecharacterisationsetwillbeTQ(γa)={‘tipped’,‘smoothedges’,‘lowlobedness’}.Thenthereal-isationforγaisasfollows:Tγa=‘likeTiliaTomentosa(SilverLime),alsowithsmoothandtippededges,andlowlobedness’.

Thesameapproachisapplicableforotherunknownleafsamples(showninFigure5.4)todescribetheminnaturallanguage.Table5.1presentsthegener-ateddescriptionsderivedfromthesemanticinference.

5.3EmpiricalEvaluation:DesignandProcedurefor

LeafSamples

Findingadirectsolutionforassessingtheapplicabilityoftheproposedcon-ceptualspacerepresentationisnotatrivialproblem.Instead,theusefulnessoftheconstructedconceptualspacesforthecasestudyofleavesisevaluatedviathelinguisticdescriptionsderivedfromsuchspaces.Theexperimentevaluatesthefollowingaims:(1)tomeasurethefeasibilityofderivingaccuratedescrip-tionstodistinguishunknownobservations,and(2)toassessthegoodnessofthedescriptionsderivedfromconceptualspacesincomparisontothedescrip-tionsderivedfromotherbase-linemodels.Totheseends,asurveyhasbeenconductedinwhichtheparticipantswereaskedto


the feature values of (a) for each quality dimension in Sleaf . For example in δ2,p2γa

=〈qso(a), qif(a)〉=〈0.86, 0.45〉.

5.2.1 Inference in Conceptual Space of Leaves

Here, it is determined whether γa is included in any defined concept’s regions,and then infer the semantic labels based on the inclusion of the instance tothe regions. Considering the leaf sample (a) in Figure 5.4, γa belongs to thesub-concept c1

tt in δ1, however, it does not belong to any sub-concept in δ2

and δ3. Thus, based on Algorithm 4.1, the symbol vector for γa is set as fol-lows: In the concept layer, using the graded membership function (defined inDefinition 4.2): Vγa,C(Ctt) = (‘Tilia Tomentosa’, 0.85). In the quality layer,for the quality dimensions of δ2 and δ3, using the graded quality function (de-fined in Definition 4.3): Vγa,Q(q

2if) = (‘tipped/toothed’, 0.75), Vγa,Q(q

2so) =

(‘smooth edges/entire’, 0.86), and Vγa,Q(q3lo) = (‘low lobedness’, 0.6).

5.2.2 Inference in Symbol Space of Leaves

By retrieving the information of the symbol vector V(γa), it is possible toverbalise the elements of symbol vectors into a set of natural language de-scriptions. As mentioned in Section 4.2.2, γa is annotated using the valuesof Vγa,C, and characterised by the values of Vγa,Q. In particular, the anno-tation set is TC(γa) = {‘Tilia Tomentosa’}, and the characterisation set willbe TQ(γa) = { ‘tipped’, ‘smooth edges’, ‘low lobedness’ }. Then the real-isation for γa is as follows: Tγa

= ‘like Tilia Tomentosa(Silver Lime),also with smooth and tipped edges, and low lobedness’.

The same approach is applicable for other unknown leaf samples (shown inFigure 5.4) to describe them in natural language. Table 5.1 presents the gener-ated descriptions derived from the semantic inference.

5.3 Empirical Evaluation: Design and Procedure for

Leaf Samples

Finding a direct solution for assessing the applicability of the proposed con-ceptual space representation is not a trivial problem. Instead, the usefulness ofthe constructed conceptual spaces for the case study of leaves is evaluated viathe linguistic descriptions derived from such spaces. The experiment evaluatesthe following aims: (1) to measure the feasibility of deriving accurate descrip-tions to distinguish unknown observations, and (2) to assess the goodness ofthe descriptions derived from conceptual spaces in comparison to the descrip-tions derived from other base-line models. To these ends, a survey has beenconducted in which the participants were asked to


thefeaturevaluesof(a)foreachqualitydimensioninSleaf.Forexampleinδ2,p2γa=〈qso(a),qif(a)〉=〈0.86,0.45〉.

5.2.1InferenceinConceptualSpaceofLeaves

Here,itisdeterminedwhetherγaisincludedinanydefinedconcept’sregions,andtheninferthesemanticlabelsbasedontheinclusionoftheinstancetotheregions.Consideringtheleafsample(a)inFigure5.4,γabelongstothesub-conceptc1

ttinδ1,however,itdoesnotbelongtoanysub-conceptinδ2

andδ3.Thus,basedonAlgorithm4.1,thesymbolvectorforγaissetasfol-lows:Intheconceptlayer,usingthegradedmembershipfunction(definedinDefinition4.2):Vγa,C(Ctt)=(‘TiliaTomentosa’,0.85).Inthequalitylayer,forthequalitydimensionsofδ2andδ3,usingthegradedqualityfunction(de-finedinDefinition4.3):Vγa,Q(q2

if)=(‘tipped/toothed’,0.75),Vγa,Q(q2so)=

(‘smoothedges/entire’,0.86),andVγa,Q(q3lo)=(‘lowlobedness’,0.6).

5.2.2InferenceinSymbolSpaceofLeaves

ByretrievingtheinformationofthesymbolvectorV(γa),itispossibletoverbalisetheelementsofsymbolvectorsintoasetofnaturallanguagede-scriptions.AsmentionedinSection4.2.2,γaisannotatedusingthevaluesofVγa,C,andcharacterisedbythevaluesofVγa,Q.Inparticular,theanno-tationsetisTC(γa)={‘TiliaTomentosa’},andthecharacterisationsetwillbeTQ(γa)={‘tipped’,‘smoothedges’,‘lowlobedness’}.Thenthereal-isationforγaisasfollows:Tγa=‘likeTiliaTomentosa(SilverLime),alsowithsmoothandtippededges,andlowlobedness’.

Thesameapproachisapplicableforotherunknownleafsamples(showninFigure5.4)todescribetheminnaturallanguage.Table5.1presentsthegener-ateddescriptionsderivedfromthesemanticinference.

5.3EmpiricalEvaluation:DesignandProcedurefor

LeafSamples

Findingadirectsolutionforassessingtheapplicabilityoftheproposedcon-ceptualspacerepresentationisnotatrivialproblem.Instead,theusefulnessoftheconstructedconceptualspacesforthecasestudyofleavesisevaluatedviathelinguisticdescriptionsderivedfromsuchspaces.Theexperimentevaluatesthefollowingaims:(1)tomeasurethefeasibilityofderivingaccuratedescrip-tionstodistinguishunknownobservations,and(2)toassessthegoodnessofthedescriptionsderivedfromconceptualspacesincomparisontothedescrip-tionsderivedfromotherbase-linemodels.Totheseends,asurveyhasbeenconductedinwhichtheparticipantswereaskedto

5.3. EMPIRICAL EVALUATION FOR LEAF SAMPLES 79

Figure 5.4: A set of unknown leaf samples.

Leaves Linguistic DescriptionFig. 5.4(a) This leaf is like Grey Willow, but it is round with a slightly

serrated margin.

Fig. 5.4(b) This leaf is like Japanese Maple, but it is oval with lobed mar-gin.

Fig. 5.4(c) This leaf is like Silver Lime, but it is tipped with a slightlytoothed margin.

Fig. 5.4(d) This leaf is not like any known leaf species, but it is linear andelongated with entire margin.

Fig. 5.4(e) This leaf is like Grey Willow, but it is round and tipped.

Fig. 5.4(f) This leaf is not like any known leaf species, but it is oval andtipped with toothed margin.

Fig. 5.4(g) This leaf is like Silver Lime, also it is tipped with low lobed andtoothed margin.

Fig. 5.4(h) This leaf is not like any known leaf species, but it is round withtoothed and lobed margin.

Fig. 5.4(i) This leaf is like English Holly, but it is tipped with serratedmargin.

Fig. 5.4(j) This leaf is like Oleander, and it is also tipped.

Table 5.1: The linguistic descriptions derived for the unknown leaf samples inFigure 5.4.

5.3.EMPIRICALEVALUATIONFORLEAFSAMPLES79

Figure5.4:Asetofunknownleafsamples.

LeavesLinguisticDescriptionFig.5.4(a)ThisleafislikeGreyWillow,butitisroundwithaslightly

serratedmargin.

Fig.5.4(b)ThisleafislikeJapaneseMaple,butitisovalwithlobedmar-gin.

Fig.5.4(c)ThisleafislikeSilverLime,butitistippedwithaslightlytoothedmargin.

Fig.5.4(d)Thisleafisnotlikeanyknownleafspecies,butitislinearandelongatedwithentiremargin.

Fig.5.4(e)ThisleafislikeGreyWillow,butitisroundandtipped.

Fig.5.4(f)Thisleafisnotlikeanyknownleafspecies,butitisovalandtippedwithtoothedmargin.

Fig.5.4(g)ThisleafislikeSilverLime,alsoitistippedwithlowlobedandtoothedmargin.

Fig.5.4(h)Thisleafisnotlikeanyknownleafspecies,butitisroundwithtoothedandlobedmargin.

Fig.5.4(i)ThisleafislikeEnglishHolly,butitistippedwithserratedmargin.

Fig.5.4(j)ThisleafislikeOleander,anditisalsotipped.

Table5.1:ThelinguisticdescriptionsderivedfortheunknownleafsamplesinFigure5.4.


Figure 5.4: A set of unknown leaf samples.

Leaves Linguistic DescriptionFig. 5.4(a) This leaf is like Grey Willow, but it is round with a slightly

serrated margin.

Fig. 5.4(b) This leaf is like Japanese Maple, but it is oval with lobed mar-gin.

Fig. 5.4(c) This leaf is like Silver Lime, but it is tipped with a slightlytoothed margin.

Fig. 5.4(d) This leaf is not like any known leaf species, but it is linear andelongated with entire margin.

Fig. 5.4(e) This leaf is like Grey Willow, but it is round and tipped.

Fig. 5.4(f) This leaf is not like any known leaf species, but it is oval andtipped with toothed margin.

Fig. 5.4(g) This leaf is like Silver Lime, also it is tipped with low lobed andtoothed margin.

Fig. 5.4(h) This leaf is not like any known leaf species, but it is round withtoothed and lobed margin.

Fig. 5.4(i) This leaf is like English Holly, but it is tipped with serratedmargin.

Fig. 5.4(j) This leaf is like Oleander, and it is also tipped.

Table 5.1: The linguistic descriptions derived for the unknown leaf samples inFigure 5.4.


Figure5.4:Asetofunknownleafsamples.

LeavesLinguisticDescriptionFig.5.4(a)ThisleafislikeGreyWillow,butitisroundwithaslightly

serratedmargin.

Fig.5.4(b)ThisleafislikeJapaneseMaple,butitisovalwithlobedmar-gin.

Fig.5.4(c)ThisleafislikeSilverLime,butitistippedwithaslightlytoothedmargin.

Fig.5.4(d)Thisleafisnotlikeanyknownleafspecies,butitislinearandelongatedwithentiremargin.

Fig.5.4(e)ThisleafislikeGreyWillow,butitisroundandtipped.

Fig.5.4(f)Thisleafisnotlikeanyknownleafspecies,butitisovalandtippedwithtoothedmargin.

Fig.5.4(g)ThisleafislikeSilverLime,alsoitistippedwithlowlobedandtoothedmargin.

Fig.5.4(h)Thisleafisnotlikeanyknownleafspecies,butitisroundwithtoothedandlobedmargin.

Fig.5.4(i)ThisleafislikeEnglishHolly,butitistippedwithserratedmargin.

Fig.5.4(j)ThisleafislikeOleander,anditisalsotipped.

Table5.1:ThelinguisticdescriptionsderivedfortheunknownleafsamplesinFigure5.4.


1. identify specific leaf based on their linguistic description derived from theconceptual space, and

2. rate the goodness of descriptions produced by different models on a Lik-ert scale.

5.3.1 Survey: Design and Procedure

The survey was conducted online3. After an introduction to the used vocabu-lary, the participants self-evaluated their familiarity with the terminology. Themain body of the survey was composed of two parts.

The first part is designed as a set of 4 multiple choices questions whereinthe participants have been asked to read the conceptual space description of arandomly selected sample (among 15 leaves) and to choose the correspondingimage of the leaf from four shown options. The three incorrect options werealso randomly selected from a pool of unknown examples. This task-based(or extrinsic) evaluation [183] allowed evaluation process not only to measurehow well the participants can connect a description of an unknown sample toits correct image, but to investigate the incorrectly identified examples and theirrelation in the conceptual space.

For the second part, a set of rating scale questions has been designed, againusing four questions per participant. In each question, an image of one un-known observation is randomly selected and shown to the participants, alongwith the three generated descriptions for that observation from three differentmodels. Participants then are asked to rate each text from 1 to 5 (Likert-scalescoring) concerning how well each of the descriptions helps them to refer tothe image. This simultaneous human-rating (or intrinsic) evaluation [183] ofdescriptions enables the evaluation process to compare the approaches relativeto each other, as well as to evaluate the absolute goodness of each approach.

In total 207 responses have been received4, out of which 102 valid responseshave been considered for the leaf data set.The survey was publicly distributedonline to anyone who was interested in participating. However, the outcomefor both data sets showed that most of the participants were in the range of 25-44 years old and they are mostly educated in computer science or equivalent.Besides, most of the participants were fluent in English speaking. About theexpertise level of the participants, The participants were not quite familiar withthe terminology that has been used for the leaf data. From the responses, 20%of the participants knew none of the lexical items, 70% knew few or some ofthem, and just 10% knew almost all of them.

3The survey can be accessed at https://survey.bana.ee4The exact number of individuals is unknown since each participant could decide to perform

one of the data sets each time or even redo it with a new set of random questions.


1.identifyspecificleafbasedontheirlinguisticdescriptionderivedfromtheconceptualspace,and

2.ratethegoodnessofdescriptionsproducedbydifferentmodelsonaLik-ertscale.

5.3.1Survey:DesignandProcedure

Thesurveywasconductedonline3.Afteranintroductiontotheusedvocabu-lary,theparticipantsself-evaluatedtheirfamiliaritywiththeterminology.Themainbodyofthesurveywascomposedoftwoparts.

Thefirstpartisdesignedasasetof4multiplechoicesquestionswhereintheparticipantshavebeenaskedtoreadtheconceptualspacedescriptionofarandomlyselectedsample(among15leaves)andtochoosethecorrespondingimageoftheleaffromfourshownoptions.Thethreeincorrectoptionswerealsorandomlyselectedfromapoolofunknownexamples.Thistask-based(orextrinsic)evaluation[183]allowedevaluationprocessnotonlytomeasurehowwelltheparticipantscanconnectadescriptionofanunknownsampletoitscorrectimage,buttoinvestigatetheincorrectlyidentifiedexamplesandtheirrelationintheconceptualspace.

Forthesecondpart,asetofratingscalequestionshasbeendesigned,againusingfourquestionsperparticipant.Ineachquestion,animageofoneun-knownobservationisrandomlyselectedandshowntotheparticipants,alongwiththethreegenerateddescriptionsforthatobservationfromthreedifferentmodels.Participantsthenareaskedtorateeachtextfrom1to5(Likert-scalescoring)concerninghowwelleachofthedescriptionshelpsthemtorefertotheimage.Thissimultaneoushuman-rating(orintrinsic)evaluation[183]ofdescriptionsenablestheevaluationprocesstocomparetheapproachesrelativetoeachother,aswellastoevaluatetheabsolutegoodnessofeachapproach.

Intotal207responseshavebeenreceived4,outofwhich102validresponseshavebeenconsideredfortheleafdataset.Thesurveywaspubliclydistributedonlinetoanyonewhowasinterestedinparticipating.However,theoutcomeforbothdatasetsshowedthatmostoftheparticipantswereintherangeof25-44yearsoldandtheyaremostlyeducatedincomputerscienceorequivalent.Besides,mostoftheparticipantswerefluentinEnglishspeaking.Abouttheexpertiseleveloftheparticipants,Theparticipantswerenotquitefamiliarwiththeterminologythathasbeenusedfortheleafdata.Fromtheresponses,20%oftheparticipantsknewnoneofthelexicalitems,70%knewfeworsomeofthem,andjust10%knewalmostallofthem.

3Thesurveycanbeaccessedathttps://survey.bana.ee4Theexactnumberofindividualsisunknownsinceeachparticipantcoulddecidetoperform

oneofthedatasetseachtimeorevenredoitwithanewsetofrandomquestions.


1. identify specific leaf based on their linguistic description derived from theconceptual space, and


5.3.1 Survey: Design and Procedure

The survey was conducted online3. After an introduction to the used vocabu-lary, the participants self-evaluated their familiarity with the terminology. Themain body of the survey was composed of two parts.

The first part is designed as a set of 4 multiple choices questions whereinthe participants have been asked to read the conceptual space description of arandomly selected sample (among 15 leaves) and to choose the correspondingimage of the leaf from four shown options. The three incorrect options werealso randomly selected from a pool of unknown examples. This task-based(or extrinsic) evaluation [183] allowed evaluation process not only to measurehow well the participants can connect a description of an unknown sample toits correct image, but to investigate the incorrectly identified examples and theirrelation in the conceptual space.

For the second part, a set of rating scale questions has been designed, againusing four questions per participant. In each question, an image of one un-known observation is randomly selected and shown to the participants, alongwith the three generated descriptions for that observation from three differentmodels. Participants then are asked to rate each text from 1 to 5 (Likert-scalescoring) concerning how well each of the descriptions helps them to refer tothe image. This simultaneous human-rating (or intrinsic) evaluation [183] ofdescriptions enables the evaluation process to compare the approaches relativeto each other, as well as to evaluate the absolute goodness of each approach.

In total 207 responses have been received4, out of which 102 valid responseshave been considered for the leaf data set.The survey was publicly distributedonline to anyone who was interested in participating. However, the outcomefor both data sets showed that most of the participants were in the range of 25-44 years old and they are mostly educated in computer science or equivalent.Besides, most of the participants were fluent in English speaking. About theexpertise level of the participants, The participants were not quite familiar withthe terminology that has been used for the leaf data. From the responses, 20%of the participants knew none of the lexical items, 70% knew few or some ofthem, and just 10% knew almost all of them.

3The survey can be accessed at https://survey.bana.ee4The exact number of individuals is unknown since each participant could decide to perform

one of the data sets each time or even redo it with a new set of random questions.


1.identifyspecificleafbasedontheirlinguisticdescriptionderivedfromtheconceptualspace,and


5.3.1Survey:DesignandProcedure

Thesurveywasconductedonline3.Afteranintroductiontotheusedvocabu-lary,theparticipantsself-evaluatedtheirfamiliaritywiththeterminology.Themainbodyofthesurveywascomposedoftwoparts.

Thefirstpartisdesignedasasetof4multiplechoicesquestionswhereintheparticipantshavebeenaskedtoreadtheconceptualspacedescriptionofarandomlyselectedsample(among15leaves)andtochoosethecorrespondingimageoftheleaffromfourshownoptions.Thethreeincorrectoptionswerealsorandomlyselectedfromapoolofunknownexamples.Thistask-based(orextrinsic)evaluation[183]allowedevaluationprocessnotonlytomeasurehowwelltheparticipantscanconnectadescriptionofanunknownsampletoitscorrectimage,buttoinvestigatetheincorrectlyidentifiedexamplesandtheirrelationintheconceptualspace.

Forthesecondpart,asetofratingscalequestionshasbeendesigned,againusingfourquestionsperparticipant.Ineachquestion,animageofoneun-knownobservationisrandomlyselectedandshowntotheparticipants,alongwiththethreegenerateddescriptionsforthatobservationfromthreedifferentmodels.Participantsthenareaskedtorateeachtextfrom1to5(Likert-scalescoring)concerninghowwelleachofthedescriptionshelpsthemtorefertotheimage.Thissimultaneoushuman-rating(orintrinsic)evaluation[183]ofdescriptionsenablestheevaluationprocesstocomparetheapproachesrelativetoeachother,aswellastoevaluatetheabsolutegoodnessofeachapproach.

Intotal207responseshavebeenreceived4,outofwhich102validresponseshavebeenconsideredfortheleafdataset.Thesurveywaspubliclydistributedonlinetoanyonewhowasinterestedinparticipating.However,theoutcomeforbothdatasetsshowedthatmostoftheparticipantswereintherangeof25-44yearsoldandtheyaremostlyeducatedincomputerscienceorequivalent.Besides,mostoftheparticipantswerefluentinEnglishspeaking.Abouttheexpertiseleveloftheparticipants,Theparticipantswerenotquitefamiliarwiththeterminologythathasbeenusedfortheleafdata.Fromtheresponses,20%oftheparticipantsknewnoneofthelexicalitems,70%knewfeworsomeofthem,andjust10%knewalmostallofthem.

3Thesurveycanbeaccessedathttps://survey.bana.ee4Theexactnumberofindividualsisunknownsinceeachparticipantcoulddecidetoperform

oneofthedatasetseachtimeorevenredoitwithanewsetofrandomquestions.


(a) A multi choice question

(b) A rating scale question

Figure 5.5: The screenshots from two types of questions designed for the survey:(a) An example of multi choice question that shows a generated description,and it asks the participants to identify which given image of leaf (from thechoices) is related to the description, (b) An example of rating scale questionthat shows three different descriptions generated for a single leaf image, and itasks the participant to rate the goodness of each given descriptions.


(a)Amultichoicequestion

(b)Aratingscalequestion

Figure5.5:Thescreenshotsfromtwotypesofquestionsdesignedforthesurvey:(a)Anexampleofmultichoicequestionthatshowsagenerateddescription,anditaskstheparticipantstoidentifywhichgivenimageofleaf(fromthechoices)isrelatedtothedescription,(b)Anexampleofratingscalequestionthatshowsthreedifferentdescriptionsgeneratedforasingleleafimage,anditaskstheparticipanttoratethegoodnessofeachgivendescriptions.


(a) A multi choice question

(b) A rating scale question

Figure 5.5: The screenshots from two types of questions designed for the survey:(a) An example of multi choice question that shows a generated description,and it asks the participants to identify which given image of leaf (from thechoices) is related to the description, (b) An example of rating scale questionthat shows three different descriptions generated for a single leaf image, and itasks the participant to rate the goodness of each given descriptions.


(a)Amultichoicequestion

(b)Aratingscalequestion

Figure5.5:Thescreenshotsfromtwotypesofquestionsdesignedforthesurvey:(a)Anexampleofmultichoicequestionthatshowsagenerateddescription,anditaskstheparticipantstoidentifywhichgivenimageofleaf(fromthechoices)isrelatedtothedescription,(b)Anexampleofratingscalequestionthatshowsthreedifferentdescriptionsgeneratedforasingleleafimage,anditaskstheparticipanttoratethegoodnessofeachgivendescriptions.


Figure 5.6: The description of leaf (a) has been shown 31 times to the partici-pants. In 19 responses, the correct image of leaf (a) is chosen by the participants(61%). The most common misidentified example (7 responses, 23%) was theimage of leaf (x), which subjectively is quite similar, and interestingly, it is theclosest instance to the leaf (a) in the conceptual space. However, the closestinstance to leaf (a) in the full feature space is leaf (y) that its image is rarelymisidentified by the participants (only 1 response, 3%).

5.3.2 Identifying Leaf Observations from Linguistic

Descriptions

Participants were able to successfully identify all the unknown observations (15leaves) with the help of the corresponding conceptual space descriptions. Thesuccess rate to identify the correct image for each description in the leaf dataset was 73% ± 13%.

For further investigation of the incorrectly identified (i.e., misidentified) ex-amples, the geometrical similarity of these answers was calculated in order tothe correct one in the conceptual space (multi-domain). According to [86], thesimilarity in conceptual spaces can be calculated by applying Euclidean dis-tance with the domains and city-block distance between them. To assess of thesimilarities in conceptual space, the geometrical similarity of the same instancesis also calculated, but in a full feature space (single-domain) by applying Eu-clidean distance. Two interesting results have been obtained: First, the misiden-tified examples are not uniformly distributed between all possible choices, butinstead, participants tended to make similar mistakes. Second, the commonmisidentified examples are most of the times (73% leaves) the closest instanceto the correct one in the conceptual space. In the full feature space, this wasonly occasionally true (46% leaves). This shows that the confused exampleswith each other are commonly the nearest instances within the multi-domainconceptual space, which is mostly not true in the full feature space. One exam-ple regarding this outcome is illustrated in Figure 5.6.

The results from the first part of the survey show that the proposed con-ceptual space representation a) is applicable to derive semantic descriptions forunknown observations, and b) is suitable to represent the cognitively similarobservations among the multiple domains.


Figure5.6:Thedescriptionofleaf(a)hasbeenshown31timestothepartici-pants.In19responses,thecorrectimageofleaf(a)ischosenbytheparticipants(61%).Themostcommonmisidentifiedexample(7responses,23%)wastheimageofleaf(x),whichsubjectivelyisquitesimilar,andinterestingly,itistheclosestinstancetotheleaf(a)intheconceptualspace.However,theclosestinstancetoleaf(a)inthefullfeaturespaceisleaf(y)thatitsimageisrarelymisidentifiedbytheparticipants(only1response,3%).

5.3.2IdentifyingLeafObservationsfromLinguistic

Descriptions

Participantswereabletosuccessfullyidentifyalltheunknownobservations(15leaves)withthehelpofthecorrespondingconceptualspacedescriptions.Thesuccessratetoidentifythecorrectimageforeachdescriptionintheleafdatasetwas73%±13%.

Forfurtherinvestigationoftheincorrectlyidentified(i.e.,misidentified)ex-amples,thegeometricalsimilarityoftheseanswerswascalculatedinordertothecorrectoneintheconceptualspace(multi-domain).Accordingto[86],thesimilarityinconceptualspacescanbecalculatedbyapplyingEuclideandis-tancewiththedomainsandcity-blockdistancebetweenthem.Toassessofthesimilaritiesinconceptualspace,thegeometricalsimilarityofthesameinstancesisalsocalculated,butinafullfeaturespace(single-domain)byapplyingEu-clideandistance.Twointerestingresultshavebeenobtained:First,themisiden-tifiedexamplesarenotuniformlydistributedbetweenallpossiblechoices,butinstead,participantstendedtomakesimilarmistakes.Second,thecommonmisidentifiedexamplesaremostofthetimes(73%leaves)theclosestinstancetothecorrectoneintheconceptualspace.Inthefullfeaturespace,thiswasonlyoccasionallytrue(46%leaves).Thisshowsthattheconfusedexampleswitheachotherarecommonlythenearestinstanceswithinthemulti-domainconceptualspace,whichismostlynottrueinthefullfeaturespace.Oneexam-pleregardingthisoutcomeisillustratedinFigure5.6.

Theresultsfromthefirstpartofthesurveyshowthattheproposedcon-ceptualspacerepresentationa)isapplicabletoderivesemanticdescriptionsforunknownobservations,andb)issuitabletorepresentthecognitivelysimilarobservationsamongthemultipledomains.


Figure 5.6: The description of leaf (a) has been shown 31 times to the partici-pants. In 19 responses, the correct image of leaf (a) is chosen by the participants(61%). The most common misidentified example (7 responses, 23%) was theimage of leaf (x), which subjectively is quite similar, and interestingly, it is theclosest instance to the leaf (a) in the conceptual space. However, the closestinstance to leaf (a) in the full feature space is leaf (y) that its image is rarelymisidentified by the participants (only 1 response, 3%).

5.3.2 Identifying Leaf Observations from Linguistic

Descriptions

Participants were able to successfully identify all the unknown observations (15leaves) with the help of the corresponding conceptual space descriptions. Thesuccess rate to identify the correct image for each description in the leaf dataset was 73% ± 13%.

For further investigation of the incorrectly identified (i.e., misidentified) ex-amples, the geometrical similarity of these answers was calculated in order tothe correct one in the conceptual space (multi-domain). According to [86], thesimilarity in conceptual spaces can be calculated by applying Euclidean dis-tance with the domains and city-block distance between them. To assess of thesimilarities in conceptual space, the geometrical similarity of the same instancesis also calculated, but in a full feature space (single-domain) by applying Eu-clidean distance. Two interesting results have been obtained: First, the misiden-tified examples are not uniformly distributed between all possible choices, butinstead, participants tended to make similar mistakes. Second, the commonmisidentified examples are most of the times (73% leaves) the closest instanceto the correct one in the conceptual space. In the full feature space, this wasonly occasionally true (46% leaves). This shows that the confused exampleswith each other are commonly the nearest instances within the multi-domainconceptual space, which is mostly not true in the full feature space. One exam-ple regarding this outcome is illustrated in Figure 5.6.

The results from the first part of the survey show that the proposed con-ceptual space representation a) is applicable to derive semantic descriptions forunknown observations, and b) is suitable to represent the cognitively similarobservations among the multiple domains.


Figure5.6:Thedescriptionofleaf(a)hasbeenshown31timestothepartici-pants.In19responses,thecorrectimageofleaf(a)ischosenbytheparticipants(61%).Themostcommonmisidentifiedexample(7responses,23%)wastheimageofleaf(x),whichsubjectivelyisquitesimilar,andinterestingly,itistheclosestinstancetotheleaf(a)intheconceptualspace.However,theclosestinstancetoleaf(a)inthefullfeaturespaceisleaf(y)thatitsimageisrarelymisidentifiedbytheparticipants(only1response,3%).

5.3.2IdentifyingLeafObservationsfromLinguistic

Descriptions

Participantswereabletosuccessfullyidentifyalltheunknownobservations(15leaves)withthehelpofthecorrespondingconceptualspacedescriptions.Thesuccessratetoidentifythecorrectimageforeachdescriptionintheleafdatasetwas73%±13%.

Forfurtherinvestigationoftheincorrectlyidentified(i.e.,misidentified)ex-amples,thegeometricalsimilarityoftheseanswerswascalculatedinordertothecorrectoneintheconceptualspace(multi-domain).Accordingto[86],thesimilarityinconceptualspacescanbecalculatedbyapplyingEuclideandis-tancewiththedomainsandcity-blockdistancebetweenthem.Toassessofthesimilaritiesinconceptualspace,thegeometricalsimilarityofthesameinstancesisalsocalculated,butinafullfeaturespace(single-domain)byapplyingEu-clideandistance.Twointerestingresultshavebeenobtained:First,themisiden-tifiedexamplesarenotuniformlydistributedbetweenallpossiblechoices,butinstead,participantstendedtomakesimilarmistakes.Second,thecommonmisidentifiedexamplesaremostofthetimes(73%leaves)theclosestinstancetothecorrectoneintheconceptualspace.Inthefullfeaturespace,thiswasonlyoccasionallytrue(46%leaves).Thisshowsthattheconfusedexampleswitheachotherarecommonlythenearestinstanceswithinthemulti-domainconceptualspace,whichismostlynottrueinthefullfeaturespace.Oneexam-pleregardingthisoutcomeisillustratedinFigure5.6.

Theresultsfromthefirstpartofthesurveyshowthattheproposedcon-ceptualspacerepresentationa)isapplicabletoderivesemanticdescriptionsforunknownobservations,andb)issuitabletorepresentthecognitivelysimilarobservationsamongthemultipledomains.


5.3.3 Rating Various Linguistic Descriptions of a Leaf

Observation

In the results of the rating scale questions, the description derived from theconceptual space model is compared with the descriptions derived from thetwo models of concept formation within the full feature space, one using a gen-erative model and the other a discriminative model [122, 162]. The generativemodel forms the concepts by modelling the distribution of individual classes(i.e., bound each of them with a convex region). Then a new observation eitherbelongs to an existing class or none of them. On the other hand, the discrimina-tive model forms the concepts by learning the (hard or soft) boundary betweenclasses (i.e., divide them into Voronoi regions). Hence, a new observation al-ways belongs to at least one class label.

Concerning these two models of concept formation, two base-line ap-proaches have been developed to generate linguistic descriptions. Inspired bythe idea of fuzzy sets for the linguistic description of data [177, 180], the gen-erative and discriminative models have been extended to quantify the inclusionof new observations within the full feature space. This will allow the modelto verbalise the numeric output of the models with linguistic descriptions. In agenerative model, fuzzy sets are employed to quantify the numeric values of thefeatures within the full-feature space. the same semantic inference algorithm isapplied (Chapter 4) on this model to derive such descriptions that most likelyinvolves only the quantified terms of the characteristic features. In a discrimi-native model, with the help of fuzzy sets, the model is extended to multi-labelclassification [211], which quantifies the membership of the instances to theconcepts. The inference algorithm is applied to this model to derive descrip-tions that involve only the assigned labels of concepts with a quantification oftheir membership degrees.

Example 5.1. For leaf (a) in Figure 5.4, here are the output descriptions fromthree various approaches:

• Conceptual: “This leaf is like Grey Willow, but it is round with slightlyserrated margin.”

• Generative: “This leaf is round, wide, connected and entire, with smoothmargin and no indentation.”

• Discriminative: “This leaf is similar to English Holly, also has some sim-ilarity to Grey Willow.”

Table 5.2 shows the statistical summary of the rating scores received for thedescriptions derived from each of the approaches in leaf data set. Also, thesescores are depicted in the form of boxplots in Figure 5.6.

An ANOVA test has been applied to show that the conceptual space de-scription (Conceptual) is significantly preferable rated than the two alterna-tives (Generative and Discriminative). Here, the Likert-scale scores were used


5.3.3RatingVariousLinguisticDescriptionsofaLeaf

Observation

Intheresultsoftheratingscalequestions,thedescriptionderivedfromtheconceptualspacemodeliscomparedwiththedescriptionsderivedfromthetwomodelsofconceptformationwithinthefullfeaturespace,oneusingagen-erativemodelandtheotheradiscriminativemodel[122,162].Thegenerativemodelformstheconceptsbymodellingthedistributionofindividualclasses(i.e.,boundeachofthemwithaconvexregion).Thenanewobservationeitherbelongstoanexistingclassornoneofthem.Ontheotherhand,thediscrimina-tivemodelformstheconceptsbylearningthe(hardorsoft)boundarybetweenclasses(i.e.,dividethemintoVoronoiregions).Hence,anewobservational-waysbelongstoatleastoneclasslabel.

Concerningthesetwomodelsofconceptformation,twobase-lineap-proacheshavebeendevelopedtogeneratelinguisticdescriptions.Inspiredbytheideaoffuzzysetsforthelinguisticdescriptionofdata[177,180],thegen-erativeanddiscriminativemodelshavebeenextendedtoquantifytheinclusionofnewobservationswithinthefullfeaturespace.Thiswillallowthemodeltoverbalisethenumericoutputofthemodelswithlinguisticdescriptions.Inagenerativemodel,fuzzysetsareemployedtoquantifythenumericvaluesofthefeatureswithinthefull-featurespace.thesamesemanticinferencealgorithmisapplied(Chapter4)onthismodeltoderivesuchdescriptionsthatmostlikelyinvolvesonlythequantifiedtermsofthecharacteristicfeatures.Inadiscrimi-nativemodel,withthehelpoffuzzysets,themodelisextendedtomulti-labelclassification[211],whichquantifiesthemembershipoftheinstancestotheconcepts.Theinferencealgorithmisappliedtothismodeltoderivedescrip-tionsthatinvolveonlytheassignedlabelsofconceptswithaquantificationoftheirmembershipdegrees.

Example5.1.Forleaf(a)inFigure5.4,herearetheoutputdescriptionsfromthreevariousapproaches:

•Conceptual:“ThisleafislikeGreyWillow,butitisroundwithslightlyserratedmargin.”

•Generative:“Thisleafisround,wide,connectedandentire,withsmoothmarginandnoindentation.”

•Discriminative:“ThisleafissimilartoEnglishHolly,alsohassomesim-ilaritytoGreyWillow.”

Table5.2showsthestatisticalsummaryoftheratingscoresreceivedforthedescriptionsderivedfromeachoftheapproachesinleafdataset.Also,thesescoresaredepictedintheformofboxplotsinFigure5.6.

AnANOVAtesthasbeenappliedtoshowthattheconceptualspacede-scription(Conceptual)issignificantlypreferableratedthanthetwoalterna-tives(GenerativeandDiscriminative).Here,theLikert-scalescoreswereused


5.3.3 Rating Various Linguistic Descriptions of a Leaf

Observation

In the results of the rating scale questions, the description derived from theconceptual space model is compared with the descriptions derived from thetwo models of concept formation within the full feature space, one using a gen-erative model and the other a discriminative model [122, 162]. The generativemodel forms the concepts by modelling the distribution of individual classes(i.e., bound each of them with a convex region). Then a new observation eitherbelongs to an existing class or none of them. On the other hand, the discrimina-tive model forms the concepts by learning the (hard or soft) boundary betweenclasses (i.e., divide them into Voronoi regions). Hence, a new observation al-ways belongs to at least one class label.

Concerning these two models of concept formation, two base-line ap-proaches have been developed to generate linguistic descriptions. Inspired bythe idea of fuzzy sets for the linguistic description of data [177, 180], the gen-erative and discriminative models have been extended to quantify the inclusionof new observations within the full feature space. This will allow the modelto verbalise the numeric output of the models with linguistic descriptions. In agenerative model, fuzzy sets are employed to quantify the numeric values of thefeatures within the full-feature space. the same semantic inference algorithm isapplied (Chapter 4) on this model to derive such descriptions that most likelyinvolves only the quantified terms of the characteristic features. In a discrimi-native model, with the help of fuzzy sets, the model is extended to multi-labelclassification [211], which quantifies the membership of the instances to theconcepts. The inference algorithm is applied to this model to derive descrip-tions that involve only the assigned labels of concepts with a quantification oftheir membership degrees.

Example 5.1. For leaf (a) in Figure 5.4, here are the output descriptions fromthree various approaches:

• Conceptual: “This leaf is like Grey Willow, but it is round with slightlyserrated margin.”

• Generative: “This leaf is round, wide, connected and entire, with smoothmargin and no indentation.”

• Discriminative: “This leaf is similar to English Holly, also has some sim-ilarity to Grey Willow.”

Table 5.2 shows the statistical summary of the rating scores received for thedescriptions derived from each of the approaches in leaf data set. Also, thesescores are depicted in the form of boxplots in Figure 5.6.

An ANOVA test has been applied to show that the conceptual space de-scription (Conceptual) is significantly preferable rated than the two alterna-tives (Generative and Discriminative). Here, the Likert-scale scores were used


5.3.3RatingVariousLinguisticDescriptionsofaLeaf

Observation

Intheresultsoftheratingscalequestions,thedescriptionderivedfromtheconceptualspacemodeliscomparedwiththedescriptionsderivedfromthetwomodelsofconceptformationwithinthefullfeaturespace,oneusingagen-erativemodelandtheotheradiscriminativemodel[122,162].Thegenerativemodelformstheconceptsbymodellingthedistributionofindividualclasses(i.e.,boundeachofthemwithaconvexregion).Thenanewobservationeitherbelongstoanexistingclassornoneofthem.Ontheotherhand,thediscrimina-tivemodelformstheconceptsbylearningthe(hardorsoft)boundarybetweenclasses(i.e.,dividethemintoVoronoiregions).Hence,anewobservational-waysbelongstoatleastoneclasslabel.

Concerningthesetwomodelsofconceptformation,twobase-lineap-proacheshavebeendevelopedtogeneratelinguisticdescriptions.Inspiredbytheideaoffuzzysetsforthelinguisticdescriptionofdata[177,180],thegen-erativeanddiscriminativemodelshavebeenextendedtoquantifytheinclusionofnewobservationswithinthefullfeaturespace.Thiswillallowthemodeltoverbalisethenumericoutputofthemodelswithlinguisticdescriptions.Inagenerativemodel,fuzzysetsareemployedtoquantifythenumericvaluesofthefeatureswithinthefull-featurespace.thesamesemanticinferencealgorithmisapplied(Chapter4)onthismodeltoderivesuchdescriptionsthatmostlikelyinvolvesonlythequantifiedtermsofthecharacteristicfeatures.Inadiscrimi-nativemodel,withthehelpoffuzzysets,themodelisextendedtomulti-labelclassification[211],whichquantifiesthemembershipoftheinstancestotheconcepts.Theinferencealgorithmisappliedtothismodeltoderivedescrip-tionsthatinvolveonlytheassignedlabelsofconceptswithaquantificationoftheirmembershipdegrees.

Example5.1.Forleaf(a)inFigure5.4,herearetheoutputdescriptionsfromthreevariousapproaches:

•Conceptual:“ThisleafislikeGreyWillow,butitisroundwithslightlyserratedmargin.”

•Generative:“Thisleafisround,wide,connectedandentire,withsmoothmarginandnoindentation.”

•Discriminative:“ThisleafissimilartoEnglishHolly,alsohassomesim-ilaritytoGreyWillow.”

Table5.2showsthestatisticalsummaryoftheratingscoresreceivedforthedescriptionsderivedfromeachoftheapproachesinleafdataset.Also,thesescoresaredepictedintheformofboxplotsinFigure5.6.

AnANOVAtesthasbeenappliedtoshowthattheconceptualspacede-scription(Conceptual)issignificantlypreferableratedthanthetwoalterna-tives(GenerativeandDiscriminative).Here,theLikert-scalescoreswereused


Table 5.2: The overall scores calculated from rating responses to the differentmodels in leaf data set. The numbers show average scores (and standard devi-ations) in the range of 1 to 5.

Mean (SD)Conceptual 3.62 (1.19)Generative 3.27 (1.33)

Discriminative 2.72 (1.24)

Figure 5.7: The box plot of the rating scores received for each of the models,deriving descriptions in leaf data set.

as the dependent variable, and the models were used as the independent vari-ables (groups). Then, the Post-hoc Tukey test was employed to identify thesignificant difference between the models. The one-way ANOVA test showed asignificant effect of the models on the scores. For leaf data set, Conceptual hasthe mean significantly different from Generative and Discriminative, p < .0001(two-tailed). The details of the test has been shown in Table 5.3.

Moreover, since the ratings are ordinal, also a non-parametric test (i.e.,Wilcoxon Test) has been carried out to identify the significant differences be-tween ratings by comparing each pair of the scores. Table 5.3 shows the p-values of this method for each pair of models. The output showed that Concep-tual model is significantly different from Generative and Discriminative models(p < .0001).

Overall, the results from this part of the survey show that the proposedconceptual space representation a) is an appropriate semantic inference model


Table5.2:Theoverallscorescalculatedfromratingresponsestothedifferentmodelsinleafdataset.Thenumbersshowaveragescores(andstandarddevi-ations)intherangeof1to5.

Mean(SD)Conceptual3.62(1.19)Generative3.27(1.33)

Discriminative2.72(1.24)

Figure5.7:Theboxplotoftheratingscoresreceivedforeachofthemodels,derivingdescriptionsinleafdataset.

asthedependentvariable,andthemodelswereusedastheindependentvari-ables(groups).Then,thePost-hocTukeytestwasemployedtoidentifythesignificantdifferencebetweenthemodels.Theone-wayANOVAtestshowedasignificanteffectofthemodelsonthescores.Forleafdataset,ConceptualhasthemeansignificantlydifferentfromGenerativeandDiscriminative,p<.0001(two-tailed).ThedetailsofthetesthasbeenshowninTable5.3.

Moreover,sincetheratingsareordinal,alsoanon-parametrictest(i.e.,WilcoxonTest)hasbeencarriedouttoidentifythesignificantdifferencesbe-tweenratingsbycomparingeachpairofthescores.Table5.3showsthep-valuesofthismethodforeachpairofmodels.TheoutputshowedthatConcep-tualmodelissignificantlydifferentfromGenerativeandDiscriminativemodels(p<.0001).

Overall,theresultsfromthispartofthesurveyshowthattheproposedconceptualspacerepresentationa)isanappropriatesemanticinferencemodel


Table 5.2: The overall scores calculated from rating responses to the differentmodels in leaf data set. The numbers show average scores (and standard devi-ations) in the range of 1 to 5.



Figure 5.7: The box plot of the rating scores received for each of the models,deriving descriptions in leaf data set.

as the dependent variable, and the models were used as the independent vari-ables (groups). Then, the Post-hoc Tukey test was employed to identify thesignificant difference between the models. The one-way ANOVA test showed asignificant effect of the models on the scores. For leaf data set, Conceptual hasthe mean significantly different from Generative and Discriminative, p < .0001(two-tailed). The details of the test has been shown in Table 5.3.

Moreover, since the ratings are ordinal, also a non-parametric test (i.e.,Wilcoxon Test) has been carried out to identify the significant differences be-tween ratings by comparing each pair of the scores. Table 5.3 shows the p-values of this method for each pair of models. The output showed that Concep-tual model is significantly different from Generative and Discriminative models(p < .0001).

Overall, the results from this part of the survey show that the proposedconceptual space representation a) is an appropriate semantic inference model


Table5.2:Theoverallscorescalculatedfromratingresponsestothedifferentmodelsinleafdataset.Thenumbersshowaveragescores(andstandarddevi-ations)intherangeof1to5.



Figure5.7:Theboxplotoftheratingscoresreceivedforeachofthemodels,derivingdescriptionsinleafdataset.

asthedependentvariable,andthemodelswereusedastheindependentvari-ables(groups).Then,thePost-hocTukeytestwasemployedtoidentifythesignificantdifferencebetweenthemodels.Theone-wayANOVAtestshowedasignificanteffectofthemodelsonthescores.Forleafdataset,ConceptualhasthemeansignificantlydifferentfromGenerativeandDiscriminative,p<.0001(two-tailed).ThedetailsofthetesthasbeenshowninTable5.3.

Moreover,sincetheratingsareordinal,alsoanon-parametrictest(i.e.,WilcoxonTest)hasbeencarriedouttoidentifythesignificantdifferencesbe-tweenratingsbycomparingeachpairofthescores.Table5.3showsthep-valuesofthismethodforeachpairofmodels.TheoutputshowedthatConcep-tualmodelissignificantlydifferentfromGenerativeandDiscriminativemodels(p<.0001).

Overall,theresultsfromthispartofthesurveyshowthattheproposedconceptualspacerepresentationa)isanappropriatesemanticinferencemodel

5.4. DISCUSSION 85

Table 5.3: Summary of the one-way ANOVA and Wilcoxon tests for the ratingscores with respect to the models deriving descriptions.

leaf data set

ANOVA TestConceptual vs.

Generative & DiscriminativeF(2, 1221)=52.82,

p<10−21

Wilcoxon TestConceptual vs. Generative p<10−4

Conceptual vs. Discriminative p<10−23

Generative vs. Discriminative p<10−07

to derive linguistic descriptions for unknown observations, and b) successfullyderives descriptions (from multi-domain space) that are naturally preferred byparticipants, in comparison to the other alternative models (from single-domainspace).

5.4 Discussion

In this chapter, the feasibility of the approach has been demonstrated in a casestudy of the real-world numerical data set. But it is notable that the frameworkproposed in this work is introduced for general use in any numeric data setswherein the known features and categories are available, but the perceptualdomains and the concept formation need to be determined. By performing anempirical evaluation, it has been assessed how well linguistic descriptions thatare generated based on the derived conceptual space enable human users firstto identify the unknown observations. This followed by a comparison betweendifferent semantic models of generating linguistic descriptions to show howwell human users prefer the descriptions from the conceptual space model torefer to unknown observations. Following, a number of issues related to theinference approach are discussed.

Multi-domain representation: The evaluation results indicate that a multi-domain representation of concepts (i.e., conceptual spaces) can lead to a betterpresentation of output descriptions in comparison to a single-domain represen-tation, since the multi-domain spaces preserve the various semantic aspects ofthe attributes for a concept, while the others combine all the attributes into asingle space.

Generalisation: While the proposed approach has been tested on two casestudies to verify its plausibility, it would be necessary to identify a general classof problems that the proposed approach can address. Many AI problems deal-ing with numeric data as the input of learning systems require semantic inter-pretation for these data, which is needed for interaction with humans. How-ever, in most of the cases, there is no a priori or expert knowledge to explain the

5.4.DISCUSSION85

Table5.3:Summaryoftheone-wayANOVAandWilcoxontestsfortheratingscoreswithrespecttothemodelsderivingdescriptions.

leafdataset

ANOVATestConceptualvs.

Generative&DiscriminativeF(2,1221)=52.82,

p<10−21

WilcoxonTestConceptualvs.Generativep<10

−4

Conceptualvs.Discriminativep<10−23

Generativevs.Discriminativep<10−07

toderivelinguisticdescriptionsforunknownobservations,andb)successfullyderivesdescriptions(frommulti-domainspace)thatarenaturallypreferredbyparticipants,incomparisontotheotheralternativemodels(fromsingle-domainspace).

5.4Discussion

Inthischapter,thefeasibilityoftheapproachhasbeendemonstratedinacasestudyofthereal-worldnumericaldataset.Butitisnotablethattheframeworkproposedinthisworkisintroducedforgeneraluseinanynumericdatasetswhereintheknownfeaturesandcategoriesareavailable,buttheperceptualdomainsandtheconceptformationneedtobedetermined.Byperforminganempiricalevaluation,ithasbeenassessedhowwelllinguisticdescriptionsthataregeneratedbasedonthederivedconceptualspaceenablehumanusersfirsttoidentifytheunknownobservations.Thisfollowedbyacomparisonbetweendifferentsemanticmodelsofgeneratinglinguisticdescriptionstoshowhowwellhumanuserspreferthedescriptionsfromtheconceptualspacemodeltorefertounknownobservations.Following,anumberofissuesrelatedtotheinferenceapproacharediscussed.

Multi-domainrepresentation:Theevaluationresultsindicatethatamulti-domainrepresentationofconcepts(i.e.,conceptualspaces)canleadtoabetterpresentationofoutputdescriptionsincomparisontoasingle-domainrepresen-tation,sincethemulti-domainspacespreservethevarioussemanticaspectsoftheattributesforaconcept,whiletheotherscombinealltheattributesintoasinglespace.

Generalisation:Whiletheproposedapproachhasbeentestedontwocasestudiestoverifyitsplausibility,itwouldbenecessarytoidentifyageneralclassofproblemsthattheproposedapproachcanaddress.ManyAIproblemsdeal-ingwithnumericdataastheinputoflearningsystemsrequiresemanticinter-pretationforthesedata,whichisneededforinteractionwithhumans.How-ever,inmostofthecases,thereisnoaprioriorexpertknowledgetoexplainthe

5.4. DISCUSSION 85


leaf data set


Generative & DiscriminativeF(2, 1221)=52.82,

p<10−21

Wilcoxon TestConceptual vs. Generative p<10−4

Conceptual vs. Discriminative p<10−23

Generative vs. Discriminative p<10−07

to derive linguistic descriptions for unknown observations, and b) successfullyderives descriptions (from multi-domain space) that are naturally preferred byparticipants, in comparison to the other alternative models (from single-domainspace).

5.4 Discussion

In this chapter, the feasibility of the approach has been demonstrated in a casestudy of the real-world numerical data set. But it is notable that the frameworkproposed in this work is introduced for general use in any numeric data setswherein the known features and categories are available, but the perceptualdomains and the concept formation need to be determined. By performing anempirical evaluation, it has been assessed how well linguistic descriptions thatare generated based on the derived conceptual space enable human users firstto identify the unknown observations. This followed by a comparison betweendifferent semantic models of generating linguistic descriptions to show howwell human users prefer the descriptions from the conceptual space model torefer to unknown observations. Following, a number of issues related to theinference approach are discussed.

Multi-domain representation: The evaluation results indicate that a multi-domain representation of concepts (i.e., conceptual spaces) can lead to a betterpresentation of output descriptions in comparison to a single-domain represen-tation, since the multi-domain spaces preserve the various semantic aspects ofthe attributes for a concept, while the others combine all the attributes into asingle space.

Generalisation: While the proposed approach has been tested on two casestudies to verify its plausibility, it would be necessary to identify a general classof problems that the proposed approach can address. Many AI problems deal-ing with numeric data as the input of learning systems require semantic inter-pretation for these data, which is needed for interaction with humans. How-ever, in most of the cases, there is no a priori or expert knowledge to explain the

5.4.DISCUSSION85


leafdataset



p<10−21


−4


Generativevs.Discriminativep<10−07

toderivelinguisticdescriptionsforunknownobservations,andb)successfullyderivesdescriptions(frommulti-domainspace)thatarenaturallypreferredbyparticipants,incomparisontotheotheralternativemodels(fromsingle-domainspace).

5.4Discussion

Inthischapter,thefeasibilityoftheapproachhasbeendemonstratedinacasestudyofthereal-worldnumericaldataset.Butitisnotablethattheframeworkproposedinthisworkisintroducedforgeneraluseinanynumericdatasetswhereintheknownfeaturesandcategoriesareavailable,buttheperceptualdomainsandtheconceptformationneedtobedetermined.Byperforminganempiricalevaluation,ithasbeenassessedhowwelllinguisticdescriptionsthataregeneratedbasedonthederivedconceptualspaceenablehumanusersfirsttoidentifytheunknownobservations.Thisfollowedbyacomparisonbetweendifferentsemanticmodelsofgeneratinglinguisticdescriptionstoshowhowwellhumanuserspreferthedescriptionsfromtheconceptualspacemodeltorefertounknownobservations.Following,anumberofissuesrelatedtotheinferenceapproacharediscussed.

Multi-domainrepresentation:Theevaluationresultsindicatethatamulti-domainrepresentationofconcepts(i.e.,conceptualspaces)canleadtoabetterpresentationofoutputdescriptionsincomparisontoasingle-domainrepresen-tation,sincethemulti-domainspacespreservethevarioussemanticaspectsoftheattributesforaconcept,whiletheotherscombinealltheattributesintoasinglespace.

Generalisation:Whiletheproposedapproachhasbeentestedontwocasestudiestoverifyitsplausibility,itwouldbenecessarytoidentifyageneralclassofproblemsthattheproposedapproachcanaddress.ManyAIproblemsdeal-ingwithnumericdataastheinputoflearningsystemsrequiresemanticinter-pretationforthesedata,whichisneededforinteractionwithhumans.How-ever,inmostofthecases,thereisnoaprioriorexpertknowledgetoexplainthe


aspects of the input observations. This is more problematic when connection-ist approaches are applied since they cannot explain what the learnt emergingmodel represents. Hence, the introduced framework to construct and utiliseconceptual spaces is generally applicable for those AI applications wherein: (1)there is a need for concept learning and concept description only based on theknown available observations, and (2) there is a lack of interpretability whilecreating a learning model, as well as the lack of explainability while testing themodel by completely unknown observations.

In sum, Part I of this thesis is concluded with the investigation on how thesemantic representations proposed in Chapter 3 and Chapter 4 can be appliedto a real-world data set, and the utility of this representation for describing newobservations.


aspectsoftheinputobservations.Thisismoreproblematicwhenconnection-istapproachesareappliedsincetheycannotexplainwhatthelearntemergingmodelrepresents.Hence,theintroducedframeworktoconstructandutiliseconceptualspacesisgenerallyapplicableforthoseAIapplicationswherein:(1)thereisaneedforconceptlearningandconceptdescriptiononlybasedontheknownavailableobservations,and(2)thereisalackofinterpretabilitywhilecreatingalearningmodel,aswellasthelackofexplainabilitywhiletestingthemodelbycompletelyunknownobservations.

Insum,PartIofthisthesisisconcludedwiththeinvestigationonhowthesemanticrepresentationsproposedinChapter3andChapter4canbeappliedtoareal-worlddataset,andtheutilityofthisrepresentationfordescribingnewobservations.


aspects of the input observations. This is more problematic when connection-ist approaches are applied since they cannot explain what the learnt emergingmodel represents. Hence, the introduced framework to construct and utiliseconceptual spaces is generally applicable for those AI applications wherein: (1)there is a need for concept learning and concept description only based on theknown available observations, and (2) there is a lack of interpretability whilecreating a learning model, as well as the lack of explainability while testing themodel by completely unknown observations.

In sum, Part I of this thesis is concluded with the investigation on how thesemantic representations proposed in Chapter 3 and Chapter 4 can be appliedto a real-world data set, and the utility of this representation for describing newobservations.


aspectsoftheinputobservations.Thisismoreproblematicwhenconnection-istapproachesareappliedsincetheycannotexplainwhatthelearntemergingmodelrepresents.Hence,theintroducedframeworktoconstructandutiliseconceptualspacesisgenerallyapplicableforthoseAIapplicationswherein:(1)thereisaneedforconceptlearningandconceptdescriptiononlybasedontheknownavailableobservations,and(2)thereisalackofinterpretabilitywhilecreatingalearningmodel,aswellasthelackofexplainabilitywhiletestingthemodelbycompletelyunknownobservations.

Insum,PartIofthisthesisisconcludedwiththeinvestigationonhowthesemanticrepresentationsproposedinChapter3andChapter4canbeappliedtoareal-worlddataset,andtheutilityofthisrepresentationfordescribingnewobservations.

Part II

Physiological Sensor Data:From Data Analysis toLinguistic Descriptions

Part II of this thesis focuses on the full framework of describing the numericaldata, specifically considering physiological sensor data. This part shows how tomine physiological patterns from sensor data, and how to use the semantic rep-resentations proposed in Part I to linguistically describe such patterns. First, thestate of the art on data mining approaches for sensor data is discussed (Chap-ter 6). After mining unseen but interesting time series patterns (Chapter 7) andtemporal rules between the patterns (Chapter 8), it is shown how the proposedsemantic model in Part I can be applied to linguistically describe these patterns(Chapter 9).

89

PartII

PhysiologicalSensorData:FromDataAnalysistoLinguisticDescriptions

PartIIofthisthesisfocusesonthefullframeworkofdescribingthenumericaldata,specificallyconsideringphysiologicalsensordata.Thispartshowshowtominephysiologicalpatternsfromsensordata,andhowtousethesemanticrep-resentationsproposedinPartItolinguisticallydescribesuchpatterns.First,thestateoftheartondataminingapproachesforsensordataisdiscussed(Chap-ter6).Afterminingunseenbutinterestingtimeseriespatterns(Chapter7)andtemporalrulesbetweenthepatterns(Chapter8),itisshownhowtheproposedsemanticmodelinPartIcanbeappliedtolinguisticallydescribethesepatterns(Chapter9).

89

Part II

Physiological Sensor Data:From Data Analysis toLinguistic Descriptions

Part II of this thesis focuses on the full framework of describing the numericaldata, specifically considering physiological sensor data. This part shows how tomine physiological patterns from sensor data, and how to use the semantic rep-resentations proposed in Part I to linguistically describe such patterns. First, thestate of the art on data mining approaches for sensor data is discussed (Chap-ter 6). After mining unseen but interesting time series patterns (Chapter 7) andtemporal rules between the patterns (Chapter 8), it is shown how the proposedsemantic model in Part I can be applied to linguistically describe these patterns(Chapter 9).

89

PartII

PhysiologicalSensorData:FromDataAnalysistoLinguisticDescriptions

PartIIofthisthesisfocusesonthefullframeworkofdescribingthenumericaldata,specificallyconsideringphysiologicalsensordata.Thispartshowshowtominephysiologicalpatternsfromsensordata,andhowtousethesemanticrep-resentationsproposedinPartItolinguisticallydescribesuchpatterns.First,thestateoftheartondataminingapproachesforsensordataisdiscussed(Chap-ter6).Afterminingunseenbutinterestingtimeseriespatterns(Chapter7)andtemporalrulesbetweenthepatterns(Chapter8),itisshownhowtheproposedsemanticmodelinPartIcanbeappliedtolinguisticallydescribethesepatterns(Chapter9).

89

Chapter 6

An Overview of Health

Monitoring with Mining

Physiological Sensor Data

“Without data you’re just another person with anopinion.”

— W. Edwards Deming (1900–1993)

As noted in chapter 1, a framework to deal with the problem of representingand describing numerical data needs first to acquire a proper set of raw

data. Subsequently, the framework needs to turn this data into information thatis suitable to be fed to a semantic representation or any other data to text sys-tem. The second part of the thesis presents the task of mining and analysing theraw measured physiological data in order to provide a set of valuable informa-tion for the semantic model. For this reason, before going through the proposedmethods for analysing physiological data and extracting valuable information,this chapter presents an overview of current health monitoring systems withinmining physiological sensor data. This overview gives a better understandingof the existing methods for analysing such data, along with investigating theirstrengths and weaknesses to be used for the goal of data explanation in naturallanguage.

The past few years have witnessed an increase in the development of wear-able sensors for health monitoring systems. With the increase of healthcareservices in non-clinical environments using vital signs provided by wearablesensors, the need to mine and process physiological measurements is growingsignificantly. This increase has been due to several factors such as developmentin sensor technology as well as directed efforts on political and stakeholder lev-

91

Chapter6

AnOverviewofHealth

MonitoringwithMining

PhysiologicalSensorData

“Withoutdatayou’rejustanotherpersonwithanopinion.”

—W.EdwardsDeming(1900–1993)

Asnotedinchapter1,aframeworktodealwiththeproblemofrepresentinganddescribingnumericaldataneedsfirsttoacquireapropersetofraw

data.Subsequently,theframeworkneedstoturnthisdataintoinformationthatissuitabletobefedtoasemanticrepresentationoranyotherdatatotextsys-tem.Thesecondpartofthethesispresentsthetaskofminingandanalysingtherawmeasuredphysiologicaldatainordertoprovideasetofvaluableinforma-tionforthesemanticmodel.Forthisreason,beforegoingthroughtheproposedmethodsforanalysingphysiologicaldataandextractingvaluableinformation,thischapterpresentsanoverviewofcurrenthealthmonitoringsystemswithinminingphysiologicalsensordata.Thisoverviewgivesabetterunderstandingoftheexistingmethodsforanalysingsuchdata,alongwithinvestigatingtheirstrengthsandweaknessestobeusedforthegoalofdataexplanationinnaturallanguage.

Thepastfewyearshavewitnessedanincreaseinthedevelopmentofwear-ablesensorsforhealthmonitoringsystems.Withtheincreaseofhealthcareservicesinnon-clinicalenvironmentsusingvitalsignsprovidedbywearablesensors,theneedtomineandprocessphysiologicalmeasurementsisgrowingsignificantly.Thisincreasehasbeenduetoseveralfactorssuchasdevelopmentinsensortechnologyaswellasdirectedeffortsonpoliticalandstakeholderlev-

91

Chapter 6

An Overview of Health

Monitoring with Mining

Physiological Sensor Data

“Without data you’re just another person with anopinion.”

— W. Edwards Deming (1900–1993)

As noted in chapter 1, a framework to deal with the problem of representingand describing numerical data needs first to acquire a proper set of raw

data. Subsequently, the framework needs to turn this data into information thatis suitable to be fed to a semantic representation or any other data to text sys-tem. The second part of the thesis presents the task of mining and analysing theraw measured physiological data in order to provide a set of valuable informa-tion for the semantic model. For this reason, before going through the proposedmethods for analysing physiological data and extracting valuable information,this chapter presents an overview of current health monitoring systems withinmining physiological sensor data. This overview gives a better understandingof the existing methods for analysing such data, along with investigating theirstrengths and weaknesses to be used for the goal of data explanation in naturallanguage.

The past few years have witnessed an increase in the development of wear-able sensors for health monitoring systems. With the increase of healthcareservices in non-clinical environments using vital signs provided by wearablesensors, the need to mine and process physiological measurements is growingsignificantly. This increase has been due to several factors such as developmentin sensor technology as well as directed efforts on political and stakeholder lev-

91

Chapter6

AnOverviewofHealth

MonitoringwithMining

PhysiologicalSensorData

“Withoutdatayou’rejustanotherpersonwithanopinion.”

—W.EdwardsDeming(1900–1993)

Asnotedinchapter1,aframeworktodealwiththeproblemofrepresentinganddescribingnumericaldataneedsfirsttoacquireapropersetofraw

data.Subsequently,theframeworkneedstoturnthisdataintoinformationthatissuitabletobefedtoasemanticrepresentationoranyotherdatatotextsys-tem.Thesecondpartofthethesispresentsthetaskofminingandanalysingtherawmeasuredphysiologicaldatainordertoprovideasetofvaluableinforma-tionforthesemanticmodel.Forthisreason,beforegoingthroughtheproposedmethodsforanalysingphysiologicaldataandextractingvaluableinformation,thischapterpresentsanoverviewofcurrenthealthmonitoringsystemswithinminingphysiologicalsensordata.Thisoverviewgivesabetterunderstandingoftheexistingmethodsforanalysingsuchdata,alongwithinvestigatingtheirstrengthsandweaknessestobeusedforthegoalofdataexplanationinnaturallanguage.

Thepastfewyearshavewitnessedanincreaseinthedevelopmentofwear-ablesensorsforhealthmonitoringsystems.Withtheincreaseofhealthcareservicesinnon-clinicalenvironmentsusingvitalsignsprovidedbywearablesensors,theneedtomineandprocessphysiologicalmeasurementsisgrowingsignificantly.Thisincreasehasbeenduetoseveralfactorssuchasdevelopmentinsensortechnologyaswellasdirectedeffortsonpoliticalandstakeholderlev-

91

92 CHAPTER 6. AN OVERVIEW OF HEALTH MONITORING

els to promote projects which address the need for providing new methods forcare given challenges with an ageing population. An important aspect in suchsystem is how the data is treated and processed. This chapter provides a reviewof the recent methods and algorithms used to analyse data from wearable sen-sors, which are used for monitoring of physiological vital signs in healthcareservices. More precisely, it outlines the common data mining tasks that havebeen recently applied for health monitoring, such as anomaly detection, pre-diction and decision making when considering continuous time series measure-ments. Moreover, the chapter details the suitability of particular data miningand machine learning methods used to process physiological data.

In health monitoring systems the focus has been recently shifting fromthe obtaining data to developing intelligent algorithms to perform a varietyof tasks [23]. Such tasks not only include traditional pattern recognition andanomaly detection, but also consider decision-based systems which can handlecontext awareness, subject-specific models, and personalisation. As the litera-ture in this field is vast, the scope of this thesis has been limited to only coverwearable sensors that measure health parameters such as vital signs for diseasemanagement and prevention. Specifically, this review is concentrated on thefollowing vital sign parameters: electrocardiogram (ECG), oxygen saturation(SpO2), heart rate (HR), Photoplethysmography (PPG), blood glucose (BG),respiratory rate (RR), and blood pressure (BP).

Works such as [51] and [27] have focused on the needs of involving wear-able sensors and overcoming essential bottlenecks for the use of wearable sen-sors such as the clinical acceptability and interoperability in health records.Most of the recent review articles on data mining of physiological sensordata are related to general studies for healthcare, i.e., well-known problemsin healthcare with simple and routine data mining approaches [159]. Recently,Sow [212] categorised the main challenges of sensor data mining in five fol-lowing stages: acquisition, preprocessing, transformation, modelling and eval-uation. The authors in [235] and [37] have used the data mining algorithmsmainly in two categories 1) descriptive or unsupervised learning (i.e. clustering,association, summarisation) and 2) predictive or supervised learning (i.e. classi-fication, regression). However, they are lacking deeper insight into the suitabil-ity of the algorithms for handling the special characteristics of the sensor datain health monitoring systems.

6.1 Data Mining Tasks in Health Monitoring

Systems

Recently, the research area of health monitoring systems has shifted from thepure reasoning of wearable sensor readings (like calculating the sleep hoursor the number of steps per day) to the higher level of data processing to givemore valuable information to the end users. Therefore, healthcare services have

92CHAPTER6.ANOVERVIEWOFHEALTHMONITORING

elstopromoteprojectswhichaddresstheneedforprovidingnewmethodsforcaregivenchallengeswithanageingpopulation.Animportantaspectinsuchsystemishowthedataistreatedandprocessed.Thischapterprovidesareviewoftherecentmethodsandalgorithmsusedtoanalysedatafromwearablesen-sors,whichareusedformonitoringofphysiologicalvitalsignsinhealthcareservices.Moreprecisely,itoutlinesthecommondataminingtasksthathavebeenrecentlyappliedforhealthmonitoring,suchasanomalydetection,pre-dictionanddecisionmakingwhenconsideringcontinuoustimeseriesmeasure-ments.Moreover,thechapterdetailsthesuitabilityofparticulardataminingandmachinelearningmethodsusedtoprocessphysiologicaldata.

Inhealthmonitoringsystemsthefocushasbeenrecentlyshiftingfromtheobtainingdatatodevelopingintelligentalgorithmstoperformavarietyoftasks[23].Suchtasksnotonlyincludetraditionalpatternrecognitionandanomalydetection,butalsoconsiderdecision-basedsystemswhichcanhandlecontextawareness,subject-specificmodels,andpersonalisation.Asthelitera-tureinthisfieldisvast,thescopeofthisthesishasbeenlimitedtoonlycoverwearablesensorsthatmeasurehealthparameterssuchasvitalsignsfordiseasemanagementandprevention.Specifically,thisreviewisconcentratedonthefollowingvitalsignparameters:electrocardiogram(ECG),oxygensaturation(SpO2),heartrate(HR),Photoplethysmography(PPG),bloodglucose(BG),respiratoryrate(RR),andbloodpressure(BP).

Workssuchas[51]and[27]havefocusedontheneedsofinvolvingwear-ablesensorsandovercomingessentialbottlenecksfortheuseofwearablesen-sorssuchastheclinicalacceptabilityandinteroperabilityinhealthrecords.Mostoftherecentreviewarticlesondataminingofphysiologicalsensordataarerelatedtogeneralstudiesforhealthcare,i.e.,well-knownproblemsinhealthcarewithsimpleandroutinedataminingapproaches[159].Recently,Sow[212]categorisedthemainchallengesofsensordatamininginfivefol-lowingstages:acquisition,preprocessing,transformation,modellingandeval-uation.Theauthorsin[235]and[37]haveusedthedataminingalgorithmsmainlyintwocategories1)descriptiveorunsupervisedlearning(i.e.clustering,association,summarisation)and2)predictiveorsupervisedlearning(i.e.classi-fication,regression).However,theyarelackingdeeperinsightintothesuitabil-ityofthealgorithmsforhandlingthespecialcharacteristicsofthesensordatainhealthmonitoringsystems.

6.1DataMiningTasksinHealthMonitoring

Systems

Recently,theresearchareaofhealthmonitoringsystemshasshiftedfromthepurereasoningofwearablesensorreadings(likecalculatingthesleephoursorthenumberofstepsperday)tothehigherlevelofdataprocessingtogivemorevaluableinformationtotheendusers.Therefore,healthcareserviceshave


els to promote projects which address the need for providing new methods forcare given challenges with an ageing population. An important aspect in suchsystem is how the data is treated and processed. This chapter provides a reviewof the recent methods and algorithms used to analyse data from wearable sen-sors, which are used for monitoring of physiological vital signs in healthcareservices. More precisely, it outlines the common data mining tasks that havebeen recently applied for health monitoring, such as anomaly detection, pre-diction and decision making when considering continuous time series measure-ments. Moreover, the chapter details the suitability of particular data miningand machine learning methods used to process physiological data.

In health monitoring systems the focus has been recently shifting fromthe obtaining data to developing intelligent algorithms to perform a varietyof tasks [23]. Such tasks not only include traditional pattern recognition andanomaly detection, but also consider decision-based systems which can handlecontext awareness, subject-specific models, and personalisation. As the litera-ture in this field is vast, the scope of this thesis has been limited to only coverwearable sensors that measure health parameters such as vital signs for diseasemanagement and prevention. Specifically, this review is concentrated on thefollowing vital sign parameters: electrocardiogram (ECG), oxygen saturation(SpO2), heart rate (HR), Photoplethysmography (PPG), blood glucose (BG),respiratory rate (RR), and blood pressure (BP).

Works such as [51] and [27] have focused on the needs of involving wear-able sensors and overcoming essential bottlenecks for the use of wearable sen-sors such as the clinical acceptability and interoperability in health records.Most of the recent review articles on data mining of physiological sensordata are related to general studies for healthcare, i.e., well-known problemsin healthcare with simple and routine data mining approaches [159]. Recently,Sow [212] categorised the main challenges of sensor data mining in five fol-lowing stages: acquisition, preprocessing, transformation, modelling and eval-uation. The authors in [235] and [37] have used the data mining algorithmsmainly in two categories 1) descriptive or unsupervised learning (i.e. clustering,association, summarisation) and 2) predictive or supervised learning (i.e. classi-fication, regression). However, they are lacking deeper insight into the suitabil-ity of the algorithms for handling the special characteristics of the sensor datain health monitoring systems.

6.1 Data Mining Tasks in Health Monitoring

Systems

Recently, the research area of health monitoring systems has shifted from thepure reasoning of wearable sensor readings (like calculating the sleep hoursor the number of steps per day) to the higher level of data processing to givemore valuable information to the end users. Therefore, healthcare services have


elstopromoteprojectswhichaddresstheneedforprovidingnewmethodsforcaregivenchallengeswithanageingpopulation.Animportantaspectinsuchsystemishowthedataistreatedandprocessed.Thischapterprovidesareviewoftherecentmethodsandalgorithmsusedtoanalysedatafromwearablesen-sors,whichareusedformonitoringofphysiologicalvitalsignsinhealthcareservices.Moreprecisely,itoutlinesthecommondataminingtasksthathavebeenrecentlyappliedforhealthmonitoring,suchasanomalydetection,pre-dictionanddecisionmakingwhenconsideringcontinuoustimeseriesmeasure-ments.Moreover,thechapterdetailsthesuitabilityofparticulardataminingandmachinelearningmethodsusedtoprocessphysiologicaldata.

Inhealthmonitoringsystemsthefocushasbeenrecentlyshiftingfromtheobtainingdatatodevelopingintelligentalgorithmstoperformavarietyoftasks[23].Suchtasksnotonlyincludetraditionalpatternrecognitionandanomalydetection,butalsoconsiderdecision-basedsystemswhichcanhandlecontextawareness,subject-specificmodels,andpersonalisation.Asthelitera-tureinthisfieldisvast,thescopeofthisthesishasbeenlimitedtoonlycoverwearablesensorsthatmeasurehealthparameterssuchasvitalsignsfordiseasemanagementandprevention.Specifically,thisreviewisconcentratedonthefollowingvitalsignparameters:electrocardiogram(ECG),oxygensaturation(SpO2),heartrate(HR),Photoplethysmography(PPG),bloodglucose(BG),respiratoryrate(RR),andbloodpressure(BP).

Workssuchas[51]and[27]havefocusedontheneedsofinvolvingwear-ablesensorsandovercomingessentialbottlenecksfortheuseofwearablesen-sorssuchastheclinicalacceptabilityandinteroperabilityinhealthrecords.Mostoftherecentreviewarticlesondataminingofphysiologicalsensordataarerelatedtogeneralstudiesforhealthcare,i.e.,well-knownproblemsinhealthcarewithsimpleandroutinedataminingapproaches[159].Recently,Sow[212]categorisedthemainchallengesofsensordatamininginfivefol-lowingstages:acquisition,preprocessing,transformation,modellingandeval-uation.Theauthorsin[235]and[37]haveusedthedataminingalgorithmsmainlyintwocategories1)descriptiveorunsupervisedlearning(i.e.clustering,association,summarisation)and2)predictiveorsupervisedlearning(i.e.classi-fication,regression).However,theyarelackingdeeperinsightintothesuitabil-ityofthealgorithmsforhandlingthespecialcharacteristicsofthesensordatainhealthmonitoringsystems.

6.1DataMiningTasksinHealthMonitoring

Systems

Recently,theresearchareaofhealthmonitoringsystemshasshiftedfromthepurereasoningofwearablesensorreadings(likecalculatingthesleephoursorthenumberofstepsperday)tothehigherlevelofdataprocessingtogivemorevaluableinformationtotheendusers.Therefore,healthcareserviceshave

6.1. DATA MINING TASKS IN HEALTH MONITORING SYSTEMS 93

been focusing on more in-depth data mining tasks to be achieved. Based onthe selected literature, three types of data mining tasks as the objectives ofhealthcare systems are predominant. These three tasks are: 1) prediction, 2)anomaly detection which include the subtask of raising alarms, and 3) diagnosisas a decision making process.

Figure 6.1 provides a depiction of each task concerning three dimensions oraspects. The first dimension involves the setting in which the health monitoringoccurs (home/remote or clinical settings). Most monitoring applications whichconsider remote settings deal predominantly with prediction and anomaly de-tection tasks, whereas the applications in clinical settings are typically focusedon diagnosis task [27, 216]. This fact is explained by the growing desire tohave a more preventative approach (prediction) via wearable sensors and toconsider the possibility to facilitate independent living in home environmentsby increasing the sense of security (alarm). Similarly, in clinical settings muchmore information is available to provide diagnosis and assist in decision mak-ing [37]. The second dimension in the figure shows the type of subjects that areunder observation (healthy or patient). For patients with known disease andmedical records, both diagnosis and precisely the possibility to raise alarms arethe essential tasks. For health monitoring which typically include healthy in-dividuals who want to ensure the maintenance of good health, prediction andanomaly detection are the considered tasks in the literature [158]. The final di-mension depicted in the figure considers how the acquired data sets have beenprocessed (online/offline). For all three mentioned tasks, the data sets have beenaddressed both online and offline manners. However, most of the alarm-relatedtasks are naturally investigated in the context of online and continuous moni-toring [212].

Anomaly Detection

Anomaly detection is the task of identifying unusual patterns which do notconform to the expected behaviour of the data [52]. Detected unusual patternsin health parameters, especially for home monitoring systems, enables the clin-icians to make accurate decisions in short time [53]. Anomaly detection tech-niques are often developed based on classification methods to distinguish thedata set into the regular classes and the outliers [97]. For example, support vec-tor machines [141], Markov models [243] and Wavelet analysis [100] are usedin healthcare systems for anomaly detection. Most of the related works usingthe anomaly detection approach usually deal with short-term [118] and multi-variate data sets [61] to characterise the entire the data to find discords. Someof the studies considered finding irregular patterns in vital signs time series suchas abnormal episodes in ECG [61, 222] and SpO2 [54], which mostly discoverunusual temporal patterns in continuous data. In online healthcare systems,alarms as soon as detecting any anomaly in vital signs will be triggered to havean instant reaction. Such alarm system is designed for monitoring patients in

6.1.DATAMININGTASKSINHEALTHMONITORINGSYSTEMS93

beenfocusingonmorein-depthdataminingtaskstobeachieved.Basedontheselectedliterature,threetypesofdataminingtasksastheobjectivesofhealthcaresystemsarepredominant.Thesethreetasksare:1)prediction,2)anomalydetectionwhichincludethesubtaskofraisingalarms,and3)diagnosisasadecisionmakingprocess.

Figure6.1providesadepictionofeachtaskconcerningthreedimensionsoraspects.Thefirstdimensioninvolvesthesettinginwhichthehealthmonitoringoccurs(home/remoteorclinicalsettings).Mostmonitoringapplicationswhichconsiderremotesettingsdealpredominantlywithpredictionandanomalyde-tectiontasks,whereastheapplicationsinclinicalsettingsaretypicallyfocusedondiagnosistask[27,216].Thisfactisexplainedbythegrowingdesiretohaveamorepreventativeapproach(prediction)viawearablesensorsandtoconsiderthepossibilitytofacilitateindependentlivinginhomeenvironmentsbyincreasingthesenseofsecurity(alarm).Similarly,inclinicalsettingsmuchmoreinformationisavailabletoprovidediagnosisandassistindecisionmak-ing[37].Theseconddimensioninthefigureshowsthetypeofsubjectsthatareunderobservation(healthyorpatient).Forpatientswithknowndiseaseandmedicalrecords,bothdiagnosisandpreciselythepossibilitytoraisealarmsaretheessentialtasks.Forhealthmonitoringwhichtypicallyincludehealthyin-dividualswhowanttoensurethemaintenanceofgoodhealth,predictionandanomalydetectionaretheconsideredtasksintheliterature[158].Thefinaldi-mensiondepictedinthefigureconsidershowtheacquireddatasetshavebeenprocessed(online/offline).Forallthreementionedtasks,thedatasetshavebeenaddressedbothonlineandofflinemanners.However,mostofthealarm-relatedtasksarenaturallyinvestigatedinthecontextofonlineandcontinuousmoni-toring[212].

AnomalyDetection

Anomalydetectionisthetaskofidentifyingunusualpatternswhichdonotconformtotheexpectedbehaviourofthedata[52].Detectedunusualpatternsinhealthparameters,especiallyforhomemonitoringsystems,enablestheclin-icianstomakeaccuratedecisionsinshorttime[53].Anomalydetectiontech-niquesareoftendevelopedbasedonclassificationmethodstodistinguishthedatasetintotheregularclassesandtheoutliers[97].Forexample,supportvec-tormachines[141],Markovmodels[243]andWaveletanalysis[100]areusedinhealthcaresystemsforanomalydetection.Mostoftherelatedworksusingtheanomalydetectionapproachusuallydealwithshort-term[118]andmulti-variatedatasets[61]tocharacterisetheentirethedatatofinddiscords.SomeofthestudiesconsideredfindingirregularpatternsinvitalsignstimeseriessuchasabnormalepisodesinECG[61,222]andSpO2[54],whichmostlydiscoverunusualtemporalpatternsincontinuousdata.Inonlinehealthcaresystems,alarmsassoonasdetectinganyanomalyinvitalsignswillbetriggeredtohaveaninstantreaction.Suchalarmsystemisdesignedformonitoringpatientsin

6.1. DATA MINING TASKS IN HEALTH MONITORING SYSTEMS 93

been focusing on more in-depth data mining tasks to be achieved. Based onthe selected literature, three types of data mining tasks as the objectives ofhealthcare systems are predominant. These three tasks are: 1) prediction, 2)anomaly detection which include the subtask of raising alarms, and 3) diagnosisas a decision making process.

Figure 6.1 provides a depiction of each task concerning three dimensions oraspects. The first dimension involves the setting in which the health monitoringoccurs (home/remote or clinical settings). Most monitoring applications whichconsider remote settings deal predominantly with prediction and anomaly de-tection tasks, whereas the applications in clinical settings are typically focusedon diagnosis task [27, 216]. This fact is explained by the growing desire tohave a more preventative approach (prediction) via wearable sensors and toconsider the possibility to facilitate independent living in home environmentsby increasing the sense of security (alarm). Similarly, in clinical settings muchmore information is available to provide diagnosis and assist in decision mak-ing [37]. The second dimension in the figure shows the type of subjects that areunder observation (healthy or patient). For patients with known disease andmedical records, both diagnosis and precisely the possibility to raise alarms arethe essential tasks. For health monitoring which typically include healthy in-dividuals who want to ensure the maintenance of good health, prediction andanomaly detection are the considered tasks in the literature [158]. The final di-mension depicted in the figure considers how the acquired data sets have beenprocessed (online/offline). For all three mentioned tasks, the data sets have beenaddressed both online and offline manners. However, most of the alarm-relatedtasks are naturally investigated in the context of online and continuous moni-toring [212].

Anomaly Detection

Anomaly detection is the task of identifying unusual patterns which do notconform to the expected behaviour of the data [52]. Detected unusual patternsin health parameters, especially for home monitoring systems, enables the clin-icians to make accurate decisions in short time [53]. Anomaly detection tech-niques are often developed based on classification methods to distinguish thedata set into the regular classes and the outliers [97]. For example, support vec-tor machines [141], Markov models [243] and Wavelet analysis [100] are usedin healthcare systems for anomaly detection. Most of the related works usingthe anomaly detection approach usually deal with short-term [118] and multi-variate data sets [61] to characterise the entire the data to find discords. Someof the studies considered finding irregular patterns in vital signs time series suchas abnormal episodes in ECG [61, 222] and SpO2 [54], which mostly discoverunusual temporal patterns in continuous data. In online healthcare systems,alarms as soon as detecting any anomaly in vital signs will be triggered to havean instant reaction. Such alarm system is designed for monitoring patients in

6.1.DATAMININGTASKSINHEALTHMONITORINGSYSTEMS93

beenfocusingonmorein-depthdataminingtaskstobeachieved.Basedontheselectedliterature,threetypesofdataminingtasksastheobjectivesofhealthcaresystemsarepredominant.Thesethreetasksare:1)prediction,2)anomalydetectionwhichincludethesubtaskofraisingalarms,and3)diagnosisasadecisionmakingprocess.

Figure6.1providesadepictionofeachtaskconcerningthreedimensionsoraspects.Thefirstdimensioninvolvesthesettinginwhichthehealthmonitoringoccurs(home/remoteorclinicalsettings).Mostmonitoringapplicationswhichconsiderremotesettingsdealpredominantlywithpredictionandanomalyde-tectiontasks,whereastheapplicationsinclinicalsettingsaretypicallyfocusedondiagnosistask[27,216].Thisfactisexplainedbythegrowingdesiretohaveamorepreventativeapproach(prediction)viawearablesensorsandtoconsiderthepossibilitytofacilitateindependentlivinginhomeenvironmentsbyincreasingthesenseofsecurity(alarm).Similarly,inclinicalsettingsmuchmoreinformationisavailabletoprovidediagnosisandassistindecisionmak-ing[37].Theseconddimensioninthefigureshowsthetypeofsubjectsthatareunderobservation(healthyorpatient).Forpatientswithknowndiseaseandmedicalrecords,bothdiagnosisandpreciselythepossibilitytoraisealarmsaretheessentialtasks.Forhealthmonitoringwhichtypicallyincludehealthyin-dividualswhowanttoensurethemaintenanceofgoodhealth,predictionandanomalydetectionaretheconsideredtasksintheliterature[158].Thefinaldi-mensiondepictedinthefigureconsidershowtheacquireddatasetshavebeenprocessed(online/offline).Forallthreementionedtasks,thedatasetshavebeenaddressedbothonlineandofflinemanners.However,mostofthealarm-relatedtasksarenaturallyinvestigatedinthecontextofonlineandcontinuousmoni-toring[212].

AnomalyDetection

Anomalydetectionisthetaskofidentifyingunusualpatternswhichdonotconformtotheexpectedbehaviourofthedata[52].Detectedunusualpatternsinhealthparameters,especiallyforhomemonitoringsystems,enablestheclin-icianstomakeaccuratedecisionsinshorttime[53].Anomalydetectiontech-niquesareoftendevelopedbasedonclassificationmethodstodistinguishthedatasetintotheregularclassesandtheoutliers[97].Forexample,supportvec-tormachines[141],Markovmodels[243]andWaveletanalysis[100]areusedinhealthcaresystemsforanomalydetection.Mostoftherelatedworksusingtheanomalydetectionapproachusuallydealwithshort-term[118]andmulti-variatedatasets[61]tocharacterisetheentirethedatatofinddiscords.SomeofthestudiesconsideredfindingirregularpatternsinvitalsignstimeseriessuchasabnormalepisodesinECG[61,222]andSpO2[54],whichmostlydiscoverunusualtemporalpatternsincontinuousdata.Inonlinehealthcaresystems,alarmsassoonasdetectinganyanomalyinvitalsignswillbetriggeredtohaveaninstantreaction.Suchalarmsystemisdesignedformonitoringpatientsin


Figure 6.1: A schematic overview of the position of the main data mining tasks(anomaly detection, prediction, and diagnosis/decision making) concerning thedifferent aspects of wearable sensing in the health monitoring systems.

clinical units [54]. However, anomaly detection task can be obtained by offlinetechniques in a sense to detect abnormal readings for subjects based on theirhistorical measurements [243].

Prediction

Prediction task has been widely considered in data mining field that identifiesthe upcoming events based on the previously recorded information. This ap-proach is getting more interesting for the healthcare providers in the medicaldomain since it prevents further chronic problems [37] and leads to make deci-sions about prognosis [38]. The role of the predictive data mining consideringwearable sensors is non-trivial due to the requirement of modelling sequentialpatterns acquired from vital signs. This approach is also known as a supervisedlearning model [235]. As the typical examples of the predictive models, authorsin [205, 218] presented a method which predicts the further stress levels of asubject. A similar case of using predictive models in healthcare are blood glu-cose level prediction [55], mortality prediction by clustering electronic healthdata [153], and a predictive decision making system for dialysis patients [234].Recently, there are new studies concerning prediction tasks, which have usedexperimental wearable sensor data to perform in non-clinical settings [60,97].


Figure6.1:Aschematicoverviewofthepositionofthemaindataminingtasks(anomalydetection,prediction,anddiagnosis/decisionmaking)concerningthedifferentaspectsofwearablesensinginthehealthmonitoringsystems.

clinicalunits[54].However,anomalydetectiontaskcanbeobtainedbyofflinetechniquesinasensetodetectabnormalreadingsforsubjectsbasedontheirhistoricalmeasurements[243].

Prediction

Predictiontaskhasbeenwidelyconsideredindataminingfieldthatidentifiestheupcomingeventsbasedonthepreviouslyrecordedinformation.Thisap-proachisgettingmoreinterestingforthehealthcareprovidersinthemedicaldomainsinceitpreventsfurtherchronicproblems[37]andleadstomakedeci-sionsaboutprognosis[38].Theroleofthepredictivedataminingconsideringwearablesensorsisnon-trivialduetotherequirementofmodellingsequentialpatternsacquiredfromvitalsigns.Thisapproachisalsoknownasasupervisedlearningmodel[235].Asthetypicalexamplesofthepredictivemodels,authorsin[205,218]presentedamethodwhichpredictsthefurtherstresslevelsofasubject.Asimilarcaseofusingpredictivemodelsinhealthcarearebloodglu-coselevelprediction[55],mortalitypredictionbyclusteringelectronichealthdata[153],andapredictivedecisionmakingsystemfordialysispatients[234].Recently,therearenewstudiesconcerningpredictiontasks,whichhaveusedexperimentalwearablesensordatatoperforminnon-clinicalsettings[60,97].


Figure 6.1: A schematic overview of the position of the main data mining tasks(anomaly detection, prediction, and diagnosis/decision making) concerning thedifferent aspects of wearable sensing in the health monitoring systems.

clinical units [54]. However, anomaly detection task can be obtained by offlinetechniques in a sense to detect abnormal readings for subjects based on theirhistorical measurements [243].

Prediction

Prediction task has been widely considered in data mining field that identifiesthe upcoming events based on the previously recorded information. This ap-proach is getting more interesting for the healthcare providers in the medicaldomain since it prevents further chronic problems [37] and leads to make deci-sions about prognosis [38]. The role of the predictive data mining consideringwearable sensors is non-trivial due to the requirement of modelling sequentialpatterns acquired from vital signs. This approach is also known as a supervisedlearning model [235]. As the typical examples of the predictive models, authorsin [205, 218] presented a method which predicts the further stress levels of asubject. A similar case of using predictive models in healthcare are blood glu-cose level prediction [55], mortality prediction by clustering electronic healthdata [153], and a predictive decision making system for dialysis patients [234].Recently, there are new studies concerning prediction tasks, which have usedexperimental wearable sensor data to perform in non-clinical settings [60,97].


Figure6.1:Aschematicoverviewofthepositionofthemaindataminingtasks(anomalydetection,prediction,anddiagnosis/decisionmaking)concerningthedifferentaspectsofwearablesensinginthehealthmonitoringsystems.

clinicalunits[54].However,anomalydetectiontaskcanbeobtainedbyofflinetechniquesinasensetodetectabnormalreadingsforsubjectsbasedontheirhistoricalmeasurements[243].

Prediction

Predictiontaskhasbeenwidelyconsideredindataminingfieldthatidentifiestheupcomingeventsbasedonthepreviouslyrecordedinformation.Thisap-proachisgettingmoreinterestingforthehealthcareprovidersinthemedicaldomainsinceitpreventsfurtherchronicproblems[37]andleadstomakedeci-sionsaboutprognosis[38].Theroleofthepredictivedataminingconsideringwearablesensorsisnon-trivialduetotherequirementofmodellingsequentialpatternsacquiredfromvitalsigns.Thisapproachisalsoknownasasupervisedlearningmodel[235].Asthetypicalexamplesofthepredictivemodels,authorsin[205,218]presentedamethodwhichpredictsthefurtherstresslevelsofasubject.Asimilarcaseofusingpredictivemodelsinhealthcarearebloodglu-coselevelprediction[55],mortalitypredictionbyclusteringelectronichealthdata[153],andapredictivedecisionmakingsystemfordialysispatients[234].Recently,therearenewstudiesconcerningpredictiontasks,whichhaveusedexperimentalwearablesensordatatoperforminnon-clinicalsettings[60,97].

6.2. DATA MINING IN HEALTH MONITORING SYSTEMS 95

Diagnosis/Decision Making

Decision making in diagnosis is one of the main tasks of clinical monitoringsystems which is often based on retrieved knowledge using vital signs, andalso other information such as electronic health records and metadata [210].Some examples of recent works on diagnosis/decision making tasks consistof: estimating the severity of health episodes of patients suffering chronic dis-ease [40, 41, 101], sleep issues such as polysomnography and apnea [43, 232],estimation and classification of health conditions [169, 226], and emotionrecognition [81]. Most of these studies have used online databases with an-notated episodes to have sufficient and trustable real-world disease labels toevaluate the decision making process. Considering the complexity of the datato infer diagnosis, some researchers frequently used classification methods suchas neural network [128] and decision trees [81] on short-term clinical data sets.

6.2 Data Mining in Health Monitoring Systems

In health monitoring systems, the role of data analysis is to extract informationfrom the low-level sensor data and bridge them to the high-level knowledgerepresentation. For this reason, recent health monitoring systems have givenmore attention to the data processing phase to catch more valuable informa-tion based on the expert user requirements. Data mining techniques that havebeen applied to wearable sensor data in health monitoring systems have varied,and it is also not uncommon that several techniques are used within the samearchitecture.

Regardless of the data mining technique used, the most standard and widelyused approaches to mining information from sensor data sets are given in Fig-ure 6.2. Acquiring the data sets as the input of the architecture is discussedin Section 6.3, and the data mining tasks as the goals of the architecture arepresented in Section 6.1. The main steps of the data mining approach consistof 1) data preprocessing, 2) feature extraction, and 3) modelling and learningthe information (considering expert knowledge and metadata) to perform thedefined tasks.

It should be noted that parameters such as expert knowledge, histori-cal data measurements, electronic health records, and stable parameters (e.g.sex, age) are essential in a data mining method. These parameters (metadata)provide contextual analysis and improve the process of knowledge extrac-tion [100, 227]. One example is that every healthcare system, that uses HRsensor data, needs to investigate the effect of metadata such as age, sex, weight,and medicine in order to have meaningful reasoning in order to find i.e., basicabnormal heart rates or to personalise the critical pulses based on the men-tioned metadata [98,115].

6.2.DATAMININGINHEALTHMONITORINGSYSTEMS95

Diagnosis/DecisionMaking

Decisionmakingindiagnosisisoneofthemaintasksofclinicalmonitoringsystemswhichisoftenbasedonretrievedknowledgeusingvitalsigns,andalsootherinformationsuchaselectronichealthrecordsandmetadata[210].Someexamplesofrecentworksondiagnosis/decisionmakingtasksconsistof:estimatingtheseverityofhealthepisodesofpatientssufferingchronicdis-ease[40,41,101],sleepissuessuchaspolysomnographyandapnea[43,232],estimationandclassificationofhealthconditions[169,226],andemotionrecognition[81].Mostofthesestudieshaveusedonlinedatabaseswithan-notatedepisodestohavesufficientandtrustablereal-worlddiseaselabelstoevaluatethedecisionmakingprocess.Consideringthecomplexityofthedatatoinferdiagnosis,someresearchersfrequentlyusedclassificationmethodssuchasneuralnetwork[128]anddecisiontrees[81]onshort-termclinicaldatasets.

6.2DataMininginHealthMonitoringSystems

Inhealthmonitoringsystems,theroleofdataanalysisistoextractinformationfromthelow-levelsensordataandbridgethemtothehigh-levelknowledgerepresentation.Forthisreason,recenthealthmonitoringsystemshavegivenmoreattentiontothedataprocessingphasetocatchmorevaluableinforma-tionbasedontheexpertuserrequirements.Dataminingtechniquesthathavebeenappliedtowearablesensordatainhealthmonitoringsystemshavevaried,anditisalsonotuncommonthatseveraltechniquesareusedwithinthesamearchitecture.

Regardlessofthedataminingtechniqueused,themoststandardandwidelyusedapproachestomininginformationfromsensordatasetsaregiveninFig-ure6.2.AcquiringthedatasetsastheinputofthearchitectureisdiscussedinSection6.3,andthedataminingtasksasthegoalsofthearchitecturearepresentedinSection6.1.Themainstepsofthedataminingapproachconsistof1)datapreprocessing,2)featureextraction,and3)modellingandlearningtheinformation(consideringexpertknowledgeandmetadata)toperformthedefinedtasks.

Itshouldbenotedthatparameterssuchasexpertknowledge,histori-caldatameasurements,electronichealthrecords,andstableparameters(e.g.sex,age)areessentialinadataminingmethod.Theseparameters(metadata)providecontextualanalysisandimprovetheprocessofknowledgeextrac-tion[100,227].Oneexampleisthateveryhealthcaresystem,thatusesHRsensordata,needstoinvestigatetheeffectofmetadatasuchasage,sex,weight,andmedicineinordertohavemeaningfulreasoninginordertofindi.e.,basicabnormalheartratesortopersonalisethecriticalpulsesbasedonthemen-tionedmetadata[98,115].


Diagnosis/Decision Making

Decision making in diagnosis is one of the main tasks of clinical monitoringsystems which is often based on retrieved knowledge using vital signs, andalso other information such as electronic health records and metadata [210].Some examples of recent works on diagnosis/decision making tasks consistof: estimating the severity of health episodes of patients suffering chronic dis-ease [40, 41, 101], sleep issues such as polysomnography and apnea [43, 232],estimation and classification of health conditions [169, 226], and emotionrecognition [81]. Most of these studies have used online databases with an-notated episodes to have sufficient and trustable real-world disease labels toevaluate the decision making process. Considering the complexity of the datato infer diagnosis, some researchers frequently used classification methods suchas neural network [128] and decision trees [81] on short-term clinical data sets.

6.2 Data Mining in Health Monitoring Systems

In health monitoring systems, the role of data analysis is to extract informationfrom the low-level sensor data and bridge them to the high-level knowledgerepresentation. For this reason, recent health monitoring systems have givenmore attention to the data processing phase to catch more valuable informa-tion based on the expert user requirements. Data mining techniques that havebeen applied to wearable sensor data in health monitoring systems have varied,and it is also not uncommon that several techniques are used within the samearchitecture.

Regardless of the data mining technique used, the most standard and widelyused approaches to mining information from sensor data sets are given in Fig-ure 6.2. Acquiring the data sets as the input of the architecture is discussedin Section 6.3, and the data mining tasks as the goals of the architecture arepresented in Section 6.1. The main steps of the data mining approach consistof 1) data preprocessing, 2) feature extraction, and 3) modelling and learningthe information (considering expert knowledge and metadata) to perform thedefined tasks.

It should be noted that parameters such as expert knowledge, histori-cal data measurements, electronic health records, and stable parameters (e.g.sex, age) are essential in a data mining method. These parameters (metadata)provide contextual analysis and improve the process of knowledge extrac-tion [100, 227]. One example is that every healthcare system, that uses HRsensor data, needs to investigate the effect of metadata such as age, sex, weight,and medicine in order to have meaningful reasoning in order to find i.e., basicabnormal heart rates or to personalise the critical pulses based on the men-tioned metadata [98,115].


Diagnosis/DecisionMaking

Decisionmakingindiagnosisisoneofthemaintasksofclinicalmonitoringsystemswhichisoftenbasedonretrievedknowledgeusingvitalsigns,andalsootherinformationsuchaselectronichealthrecordsandmetadata[210].Someexamplesofrecentworksondiagnosis/decisionmakingtasksconsistof:estimatingtheseverityofhealthepisodesofpatientssufferingchronicdis-ease[40,41,101],sleepissuessuchaspolysomnographyandapnea[43,232],estimationandclassificationofhealthconditions[169,226],andemotionrecognition[81].Mostofthesestudieshaveusedonlinedatabaseswithan-notatedepisodestohavesufficientandtrustablereal-worlddiseaselabelstoevaluatethedecisionmakingprocess.Consideringthecomplexityofthedatatoinferdiagnosis,someresearchersfrequentlyusedclassificationmethodssuchasneuralnetwork[128]anddecisiontrees[81]onshort-termclinicaldatasets.

6.2DataMininginHealthMonitoringSystems

Inhealthmonitoringsystems,theroleofdataanalysisistoextractinformationfromthelow-levelsensordataandbridgethemtothehigh-levelknowledgerepresentation.Forthisreason,recenthealthmonitoringsystemshavegivenmoreattentiontothedataprocessingphasetocatchmorevaluableinforma-tionbasedontheexpertuserrequirements.Dataminingtechniquesthathavebeenappliedtowearablesensordatainhealthmonitoringsystemshavevaried,anditisalsonotuncommonthatseveraltechniquesareusedwithinthesamearchitecture.

Regardlessofthedataminingtechniqueused,themoststandardandwidelyusedapproachestomininginformationfromsensordatasetsaregiveninFig-ure6.2.AcquiringthedatasetsastheinputofthearchitectureisdiscussedinSection6.3,andthedataminingtasksasthegoalsofthearchitecturearepresentedinSection6.1.Themainstepsofthedataminingapproachconsistof1)datapreprocessing,2)featureextraction,and3)modellingandlearningtheinformation(consideringexpertknowledgeandmetadata)toperformthedefinedtasks.

Itshouldbenotedthatparameterssuchasexpertknowledge,histori-caldatameasurements,electronichealthrecords,andstableparameters(e.g.sex,age)areessentialinadataminingmethod.Theseparameters(metadata)providecontextualanalysisandimprovetheprocessofknowledgeextrac-tion[100,227].Oneexampleisthateveryhealthcaresystem,thatusesHRsensordata,needstoinvestigatetheeffectofmetadatasuchasage,sex,weight,andmedicineinordertohavemeaningfulreasoninginordertofindi.e.,basicabnormalheartratesortopersonalisethecriticalpulsesbasedonthemen-tionedmetadata[98,115].


Figure 6.2: A generic architecture of the primary data mining approach forwearable sensor data.

6.2.1 Preprocessing

Due to the occurrence of noise, motion artefacts, and sensor errors in any wear-able sensor networks, preprocessing of the raw data is necessary. Preprocess-ing in the healthcare domain involves 1) filter unusual data to remove arte-facts [22, 152, 208], and 2) remove high frequency noise [81, 117, 137]. Themain challenges of the preprocessing phase in healthcare systems are addressedin [212] which includes data formatting, data normalisation, and data synchro-nisation. Since the gathered sensor data is often unreliable and massive, studiesworking with large-scale and continuous data have necessarily employed a pre-processing step [101].

6.2.2 Feature Extraction/Selection

According to the magnitude and complexity of the raw data, feature extractionprovides a representation of the sensor data which can formulate the relationof raw data with the expected knowledge for decision making [39]. Moreover,reducing the amount of sensor data is another task in feature extraction andselection phases which leads to having an arranged vector of features as aninput of data mining techniques like classifier methods [101,152].

Since Wearable sensor data that provide monitoring of vital sign parameterstend to be continuous time series readings, most of the considered featuresare related to the properties of time series signals [60]. Two main aspects ofanalysing signals are time domain and spectral domain [24]. In physiologicaldata, the time-domain features are common, because the traditional decisionmaking frameworks on vital signs have relied on the observable trends in thesignal [54]. However, for extra knowledge about the periodic behaviour of timeseries, research in the medical field has given more attention to the featuresacquired from frequency-domains [100, 101]. Table 6.1 summarises the mostappeared features in the literature that extracted from wearable sensor data.


Figure6.2:Agenericarchitectureoftheprimarydataminingapproachforwearablesensordata.

6.2.1Preprocessing

Duetotheoccurrenceofnoise,motionartefacts,andsensorerrorsinanywear-ablesensornetworks,preprocessingoftherawdataisnecessary.Preprocess-inginthehealthcaredomaininvolves1)filterunusualdatatoremovearte-facts[22,152,208],and2)removehighfrequencynoise[81,117,137].Themainchallengesofthepreprocessingphaseinhealthcaresystemsareaddressedin[212]whichincludesdataformatting,datanormalisation,anddatasynchro-nisation.Sincethegatheredsensordataisoftenunreliableandmassive,studiesworkingwithlarge-scaleandcontinuousdatahavenecessarilyemployedapre-processingstep[101].

6.2.2FeatureExtraction/Selection

Accordingtothemagnitudeandcomplexityoftherawdata,featureextractionprovidesarepresentationofthesensordatawhichcanformulatetherelationofrawdatawiththeexpectedknowledgefordecisionmaking[39].Moreover,reducingtheamountofsensordataisanothertaskinfeatureextractionandselectionphaseswhichleadstohavinganarrangedvectoroffeaturesasaninputofdataminingtechniqueslikeclassifiermethods[101,152].

SinceWearablesensordatathatprovidemonitoringofvitalsignparameterstendtobecontinuoustimeseriesreadings,mostoftheconsideredfeaturesarerelatedtothepropertiesoftimeseriessignals[60].Twomainaspectsofanalysingsignalsaretimedomainandspectraldomain[24].Inphysiologicaldata,thetime-domainfeaturesarecommon,becausethetraditionaldecisionmakingframeworksonvitalsignshavereliedontheobservabletrendsinthesignal[54].However,forextraknowledgeabouttheperiodicbehaviouroftimeseries,researchinthemedicalfieldhasgivenmoreattentiontothefeaturesacquiredfromfrequency-domains[100,101].Table6.1summarisesthemostappearedfeaturesintheliteraturethatextractedfromwearablesensordata.


Figure 6.2: A generic architecture of the primary data mining approach forwearable sensor data.

6.2.1 Preprocessing

Due to the occurrence of noise, motion artefacts, and sensor errors in any wear-able sensor networks, preprocessing of the raw data is necessary. Preprocess-ing in the healthcare domain involves 1) filter unusual data to remove arte-facts [22, 152, 208], and 2) remove high frequency noise [81, 117, 137]. Themain challenges of the preprocessing phase in healthcare systems are addressedin [212] which includes data formatting, data normalisation, and data synchro-nisation. Since the gathered sensor data is often unreliable and massive, studiesworking with large-scale and continuous data have necessarily employed a pre-processing step [101].

6.2.2 Feature Extraction/Selection

According to the magnitude and complexity of the raw data, feature extractionprovides a representation of the sensor data which can formulate the relationof raw data with the expected knowledge for decision making [39]. Moreover,reducing the amount of sensor data is another task in feature extraction andselection phases which leads to having an arranged vector of features as aninput of data mining techniques like classifier methods [101,152].

Since Wearable sensor data that provide monitoring of vital sign parameterstend to be continuous time series readings, most of the considered featuresare related to the properties of time series signals [60]. Two main aspects ofanalysing signals are time domain and spectral domain [24]. In physiologicaldata, the time-domain features are common, because the traditional decisionmaking frameworks on vital signs have relied on the observable trends in thesignal [54]. However, for extra knowledge about the periodic behaviour of timeseries, research in the medical field has given more attention to the featuresacquired from frequency-domains [100, 101]. Table 6.1 summarises the mostappeared features in the literature that extracted from wearable sensor data.


Figure6.2:Agenericarchitectureoftheprimarydataminingapproachforwearablesensordata.

6.2.1Preprocessing

Duetotheoccurrenceofnoise,motionartefacts,andsensorerrorsinanywear-ablesensornetworks,preprocessingoftherawdataisnecessary.Preprocess-inginthehealthcaredomaininvolves1)filterunusualdatatoremovearte-facts[22,152,208],and2)removehighfrequencynoise[81,117,137].Themainchallengesofthepreprocessingphaseinhealthcaresystemsareaddressedin[212]whichincludesdataformatting,datanormalisation,anddatasynchro-nisation.Sincethegatheredsensordataisoftenunreliableandmassive,studiesworkingwithlarge-scaleandcontinuousdatahavenecessarilyemployedapre-processingstep[101].

6.2.2FeatureExtraction/Selection

Accordingtothemagnitudeandcomplexityoftherawdata,featureextractionprovidesarepresentationofthesensordatawhichcanformulatetherelationofrawdatawiththeexpectedknowledgefordecisionmaking[39].Moreover,reducingtheamountofsensordataisanothertaskinfeatureextractionandselectionphaseswhichleadstohavinganarrangedvectoroffeaturesasaninputofdataminingtechniqueslikeclassifiermethods[101,152].

SinceWearablesensordatathatprovidemonitoringofvitalsignparameterstendtobecontinuoustimeseriesreadings,mostoftheconsideredfeaturesarerelatedtothepropertiesoftimeseriessignals[60].Twomainaspectsofanalysingsignalsaretimedomainandspectraldomain[24].Inphysiologicaldata,thetime-domainfeaturesarecommon,becausethetraditionaldecisionmakingframeworksonvitalsignshavereliedontheobservabletrendsinthesignal[54].However,forextraknowledgeabouttheperiodicbehaviouroftimeseries,researchinthemedicalfieldhasgivenmoreattentiontothefeaturesacquiredfromfrequency-domains[100,101].Table6.1summarisesthemostappearedfeaturesintheliteraturethatextractedfromwearablesensordata.


Time Domain Spectral Domain Other Features

ECG

mean R-R, std R-R, meanHR, std HR [218], numberof R-R interval [141], meanR-R, std R-R interval [39]

spectral energy [117,141],power spectral density [222],

low-pass filter [101],low/high frequency [39,218]

-

SpO2

mean, zero crossing counts,entropy [232], mean,

slope [22],self-similarity [152]

energy, low frequency [152]drift from normal-ity range [22], en-tropy [152]

HRmean, slope [22], mean,self-similarity, std [152]

energy, low/highfrequency [152], low/highfrequency [208], wavelet

coefficients of datasegments [101], low/highfrequency, power spectral

density [60]

Drift from nor-mality range [22],Entropy, Co-occurrence coeffi-cients [152]

PPGrise times, max, min,

mean [208]low/high frequency [208] -

BP Mean, Slope [22] - rule based [13]

RR max, min, mean [39]. -residual and tidalvolume [39].

OtherSensorData

zero crossings count, peakvalue, rise time

(EMG) [175], mean,duration (GSR) [208], pickvalue, min, max (SCR) [81],

total magnitude, duration(GSR) [218].

spectral energy (EEG) [141],median and mean frequency,

spectral energy(EMG) [175], energy

(GSR) [208].

bandwidth,peaks count(GSR) [208].

Table 6.1: The summarisation of the most commonly used features of eachwearable sensor data in the literature.

6.2.3 Modelling and Learning Methods

Appropriate data processing techniques are essential, in order to make senseof the data [212]. This section briefly outlines the most common data miningalgorithms used for modelling and learning sensor data.

Neural networks Neural Network (NN) is an artificial intelligence approachwhich is widely used for classification and prediction [168]. Due to the pre-dictive performance of NN, it is presently the most popular data modellingmethod used in the medical domain [38]. The NN is able to profoundly modelnonlinear systems such as physiological records where the correlation of theinput parameters is not easily detectable [21]. A wide range of the diagno-sis and decision making tasks has been done by NN in the healthcare sys-


TimeDomainSpectralDomainOtherFeatures

ECG

meanR-R,stdR-R,meanHR,stdHR[218],numberofR-Rinterval[141],meanR-R,stdR-Rinterval[39]

spectralenergy[117,141],powerspectraldensity[222],

low-passfilter[101],low/highfrequency[39,218]

-

SpO2

mean,zerocrossingcounts,entropy[232],mean,

slope[22],self-similarity[152]

energy,lowfrequency[152]driftfromnormal-ityrange[22],en-tropy[152]

HRmean,slope[22],mean,self-similarity,std[152]

energy,low/highfrequency[152],low/highfrequency[208],wavelet

coefficientsofdatasegments[101],low/highfrequency,powerspectral

density[60]

Driftfromnor-malityrange[22],Entropy,Co-occurrencecoeffi-cients[152]

PPGrisetimes,max,min,

mean[208]low/highfrequency[208]-

BPMean,Slope[22]-rulebased[13]

RRmax,min,mean[39].-residualandtidalvolume[39].

OtherSensorData

zerocrossingscount,peakvalue,risetime

(EMG)[175],mean,duration(GSR)[208],pickvalue,min,max(SCR)[81],

totalmagnitude,duration(GSR)[218].

spectralenergy(EEG)[141],medianandmeanfrequency,

spectralenergy(EMG)[175],energy

(GSR)[208].

bandwidth,peakscount(GSR)[208].

Table6.1:Thesummarisationofthemostcommonlyusedfeaturesofeachwearablesensordataintheliterature.

6.2.3ModellingandLearningMethods

Appropriatedataprocessingtechniquesareessential,inordertomakesenseofthedata[212].Thissectionbrieflyoutlinesthemostcommondataminingalgorithmsusedformodellingandlearningsensordata.

NeuralnetworksNeuralNetwork(NN)isanartificialintelligenceapproachwhichiswidelyusedforclassificationandprediction[168].Duetothepre-dictiveperformanceofNN,itispresentlythemostpopulardatamodellingmethodusedinthemedicaldomain[38].TheNNisabletoprofoundlymodelnonlinearsystemssuchasphysiologicalrecordswherethecorrelationoftheinputparametersisnoteasilydetectable[21].Awiderangeofthediagno-sisanddecisionmakingtaskshasbeendonebyNNinthehealthcaresys-


Time Domain Spectral Domain Other Features

ECG

mean R-R, std R-R, meanHR, std HR [218], numberof R-R interval [141], meanR-R, std R-R interval [39]

spectral energy [117,141],power spectral density [222],

low-pass filter [101],low/high frequency [39,218]

-

SpO2

mean, zero crossing counts,entropy [232], mean,

slope [22],self-similarity [152]

energy, low frequency [152]drift from normal-ity range [22], en-tropy [152]

HRmean, slope [22], mean,self-similarity, std [152]

energy, low/highfrequency [152], low/highfrequency [208], wavelet

coefficients of datasegments [101], low/highfrequency, power spectral

density [60]

Drift from nor-mality range [22],Entropy, Co-occurrence coeffi-cients [152]

PPGrise times, max, min,

mean [208]low/high frequency [208] -

BP Mean, Slope [22] - rule based [13]

RR max, min, mean [39]. -residual and tidalvolume [39].

OtherSensorData

zero crossings count, peakvalue, rise time

(EMG) [175], mean,duration (GSR) [208], pickvalue, min, max (SCR) [81],

total magnitude, duration(GSR) [218].

spectral energy (EEG) [141],median and mean frequency,

spectral energy(EMG) [175], energy

(GSR) [208].

bandwidth,peaks count(GSR) [208].

Table 6.1: The summarisation of the most commonly used features of eachwearable sensor data in the literature.

6.2.3 Modelling and Learning Methods

Appropriate data processing techniques are essential, in order to make senseof the data [212]. This section briefly outlines the most common data miningalgorithms used for modelling and learning sensor data.

Neural networks Neural Network (NN) is an artificial intelligence approachwhich is widely used for classification and prediction [168]. Due to the pre-dictive performance of NN, it is presently the most popular data modellingmethod used in the medical domain [38]. The NN is able to profoundly modelnonlinear systems such as physiological records where the correlation of theinput parameters is not easily detectable [21]. A wide range of the diagno-sis and decision making tasks has been done by NN in the healthcare sys-


TimeDomainSpectralDomainOtherFeatures

ECG

meanR-R,stdR-R,meanHR,stdHR[218],numberofR-Rinterval[141],meanR-R,stdR-Rinterval[39]

spectralenergy[117,141],powerspectraldensity[222],

low-passfilter[101],low/highfrequency[39,218]

-

SpO2

mean,zerocrossingcounts,entropy[232],mean,

slope[22],self-similarity[152]

energy,lowfrequency[152]driftfromnormal-ityrange[22],en-tropy[152]

HRmean,slope[22],mean,self-similarity,std[152]

energy,low/highfrequency[152],low/highfrequency[208],wavelet

coefficientsofdatasegments[101],low/highfrequency,powerspectral

density[60]

Driftfromnor-malityrange[22],Entropy,Co-occurrencecoeffi-cients[152]

PPGrisetimes,max,min,

mean[208]low/highfrequency[208]-

BPMean,Slope[22]-rulebased[13]

RRmax,min,mean[39].-residualandtidalvolume[39].

OtherSensorData

zerocrossingscount,peakvalue,risetime

(EMG)[175],mean,duration(GSR)[208],pickvalue,min,max(SCR)[81],

totalmagnitude,duration(GSR)[218].

spectralenergy(EEG)[141],medianandmeanfrequency,

spectralenergy(EMG)[175],energy

(GSR)[208].

bandwidth,peakscount(GSR)[208].

Table6.1:Thesummarisationofthemostcommonlyusedfeaturesofeachwearablesensordataintheliterature.

6.2.3ModellingandLearningMethods

Appropriatedataprocessingtechniquesareessential,inordertomakesenseofthedata[212].Thissectionbrieflyoutlinesthemostcommondataminingalgorithmsusedformodellingandlearningsensordata.

NeuralnetworksNeuralNetwork(NN)isanartificialintelligenceapproachwhichiswidelyusedforclassificationandprediction[168].Duetothepre-dictiveperformanceofNN,itispresentlythemostpopulardatamodellingmethodusedinthemedicaldomain[38].TheNNisabletoprofoundlymodelnonlinearsystemssuchasphysiologicalrecordswherethecorrelationoftheinputparametersisnoteasilydetectable[21].Awiderangeofthediagno-sisanddecisionmakingtaskshasbeendonebyNNinthehealthcaresys-


tems [143,167]. Since the progress of learning in NN is to some extent compli-cated, the method is commonly used in clinical conditions with large and com-plicated data sets [101]. Also, as the modelling in NN is a black box progress,NN methods need to be adjusted for different input data [55].

Decision trees Decision tree is a vital learning technique which provides anefficient representation of rule classification [81]. The decision tree is a reli-able technique to use in different areas of the medical domain in order tomake a right decision [150, 173]. Nowadays, upon dealing with complex andnoisy sensor data, the C4.5 algorithm is the used characteristic version of thismethod [234]. More attention has been given to decision trees in the med-ical domain when short-term data with few numbers of subjects have beenused [40, 97, 100]. This method is also suitable for handling multivariate sen-sors due to the construction of independent levels in the decision tree [218].

Support vector machines Support vector machine (SVM) is a statistical learn-ing method to classify unseen information by deriving a high dimensional hy-perplane for the features in order to make a decision model [63]. Commonhealth parameters considered by SVM methods are ECG, HR, and SpO2 whichare mostly used in the short-term and annotated form [117, 141, 222]. Ingeneral, SVM techniques are often proposed for anomaly detection and deci-sion making tasks in healthcare services. However, SVM is not an appropriatemethod to integrate domain knowledge in order to use metadata or symbolicknowledge seamlessly with the sensor measurements [141].

Gaussian mixture models Gaussian mixture model (GMM) is a statistical ap-proach that used for classification and pattern recognition [227]. Studies usingGMM usually deal with annotated medical data in order to assess the per-formance of the model [61, 227]. Despite the fact that the GMM method candetect unseen information in physiological data, it has rarely been used forprediction tasks because the computation time of the constructing models isrelatively high [101].

Other methods Out of the considered methods, there are other data miningtechniques, which are roughly used in physiological data analysis. Some exam-ples are: Hidden Markov Models for anomaly detection task [26,243], Bayesiannetworks for prediction task [97, 143], Rule-based reasoning for anomaly(event) recognition [16,127], Fourier and wavelet transforms for mostly featureselection [100,128] and noise reduction in physiological data [74,195], and fi-nally Association rule mining for prediction and diagnosis tasks [114, 234].More details can be found in a literature review article [28] that have beendone by the author of this thesis.


tems[143,167].SincetheprogressoflearninginNNistosomeextentcompli-cated,themethodiscommonlyusedinclinicalconditionswithlargeandcom-plicateddatasets[101].Also,asthemodellinginNNisablackboxprogress,NNmethodsneedtobeadjustedfordifferentinputdata[55].

DecisiontreesDecisiontreeisavitallearningtechniquewhichprovidesanefficientrepresentationofruleclassification[81].Thedecisiontreeisareli-abletechniquetouseindifferentareasofthemedicaldomaininordertomakearightdecision[150,173].Nowadays,upondealingwithcomplexandnoisysensordata,theC4.5algorithmistheusedcharacteristicversionofthismethod[234].Moreattentionhasbeengiventodecisiontreesinthemed-icaldomainwhenshort-termdatawithfewnumbersofsubjectshavebeenused[40,97,100].Thismethodisalsosuitableforhandlingmultivariatesen-sorsduetotheconstructionofindependentlevelsinthedecisiontree[218].

SupportvectormachinesSupportvectormachine(SVM)isastatisticallearn-ingmethodtoclassifyunseeninformationbyderivingahighdimensionalhy-perplaneforthefeaturesinordertomakeadecisionmodel[63].CommonhealthparametersconsideredbySVMmethodsareECG,HR,andSpO2whicharemostlyusedintheshort-termandannotatedform[117,141,222].Ingeneral,SVMtechniquesareoftenproposedforanomalydetectionanddeci-sionmakingtasksinhealthcareservices.However,SVMisnotanappropriatemethodtointegratedomainknowledgeinordertousemetadataorsymbolicknowledgeseamlesslywiththesensormeasurements[141].

GaussianmixturemodelsGaussianmixturemodel(GMM)isastatisticalap-proachthatusedforclassificationandpatternrecognition[227].StudiesusingGMMusuallydealwithannotatedmedicaldatainordertoassesstheper-formanceofthemodel[61,227].DespitethefactthattheGMMmethodcandetectunseeninformationinphysiologicaldata,ithasrarelybeenusedforpredictiontasksbecausethecomputationtimeoftheconstructingmodelsisrelativelyhigh[101].

OthermethodsOutoftheconsideredmethods,thereareotherdataminingtechniques,whichareroughlyusedinphysiologicaldataanalysis.Someexam-plesare:HiddenMarkovModelsforanomalydetectiontask[26,243],Bayesiannetworksforpredictiontask[97,143],Rule-basedreasoningforanomaly(event)recognition[16,127],Fourierandwavelettransformsformostlyfeatureselection[100,128]andnoisereductioninphysiologicaldata[74,195],andfi-nallyAssociationruleminingforpredictionanddiagnosistasks[114,234].Moredetailscanbefoundinaliteraturereviewarticle[28]thathavebeendonebytheauthorofthisthesis.


tems [143,167]. Since the progress of learning in NN is to some extent compli-cated, the method is commonly used in clinical conditions with large and com-plicated data sets [101]. Also, as the modelling in NN is a black box progress,NN methods need to be adjusted for different input data [55].

Decision trees Decision tree is a vital learning technique which provides anefficient representation of rule classification [81]. The decision tree is a reli-able technique to use in different areas of the medical domain in order tomake a right decision [150, 173]. Nowadays, upon dealing with complex andnoisy sensor data, the C4.5 algorithm is the used characteristic version of thismethod [234]. More attention has been given to decision trees in the med-ical domain when short-term data with few numbers of subjects have beenused [40, 97, 100]. This method is also suitable for handling multivariate sen-sors due to the construction of independent levels in the decision tree [218].

Support vector machines Support vector machine (SVM) is a statistical learn-ing method to classify unseen information by deriving a high dimensional hy-perplane for the features in order to make a decision model [63]. Commonhealth parameters considered by SVM methods are ECG, HR, and SpO2 whichare mostly used in the short-term and annotated form [117, 141, 222]. Ingeneral, SVM techniques are often proposed for anomaly detection and deci-sion making tasks in healthcare services. However, SVM is not an appropriatemethod to integrate domain knowledge in order to use metadata or symbolicknowledge seamlessly with the sensor measurements [141].

Gaussian mixture models Gaussian mixture model (GMM) is a statistical ap-proach that used for classification and pattern recognition [227]. Studies usingGMM usually deal with annotated medical data in order to assess the per-formance of the model [61, 227]. Despite the fact that the GMM method candetect unseen information in physiological data, it has rarely been used forprediction tasks because the computation time of the constructing models isrelatively high [101].

Other methods Out of the considered methods, there are other data miningtechniques, which are roughly used in physiological data analysis. Some exam-ples are: Hidden Markov Models for anomaly detection task [26,243], Bayesiannetworks for prediction task [97, 143], Rule-based reasoning for anomaly(event) recognition [16,127], Fourier and wavelet transforms for mostly featureselection [100,128] and noise reduction in physiological data [74,195], and fi-nally Association rule mining for prediction and diagnosis tasks [114, 234].More details can be found in a literature review article [28] that have beendone by the author of this thesis.


tems[143,167].SincetheprogressoflearninginNNistosomeextentcompli-cated,themethodiscommonlyusedinclinicalconditionswithlargeandcom-plicateddatasets[101].Also,asthemodellinginNNisablackboxprogress,NNmethodsneedtobeadjustedfordifferentinputdata[55].

DecisiontreesDecisiontreeisavitallearningtechniquewhichprovidesanefficientrepresentationofruleclassification[81].Thedecisiontreeisareli-abletechniquetouseindifferentareasofthemedicaldomaininordertomakearightdecision[150,173].Nowadays,upondealingwithcomplexandnoisysensordata,theC4.5algorithmistheusedcharacteristicversionofthismethod[234].Moreattentionhasbeengiventodecisiontreesinthemed-icaldomainwhenshort-termdatawithfewnumbersofsubjectshavebeenused[40,97,100].Thismethodisalsosuitableforhandlingmultivariatesen-sorsduetotheconstructionofindependentlevelsinthedecisiontree[218].

SupportvectormachinesSupportvectormachine(SVM)isastatisticallearn-ingmethodtoclassifyunseeninformationbyderivingahighdimensionalhy-perplaneforthefeaturesinordertomakeadecisionmodel[63].CommonhealthparametersconsideredbySVMmethodsareECG,HR,andSpO2whicharemostlyusedintheshort-termandannotatedform[117,141,222].Ingeneral,SVMtechniquesareoftenproposedforanomalydetectionanddeci-sionmakingtasksinhealthcareservices.However,SVMisnotanappropriatemethodtointegratedomainknowledgeinordertousemetadataorsymbolicknowledgeseamlesslywiththesensormeasurements[141].

GaussianmixturemodelsGaussianmixturemodel(GMM)isastatisticalap-proachthatusedforclassificationandpatternrecognition[227].StudiesusingGMMusuallydealwithannotatedmedicaldatainordertoassesstheper-formanceofthemodel[61,227].DespitethefactthattheGMMmethodcandetectunseeninformationinphysiologicaldata,ithasrarelybeenusedforpredictiontasksbecausethecomputationtimeoftheconstructingmodelsisrelativelyhigh[101].

OthermethodsOutoftheconsideredmethods,thereareotherdataminingtechniques,whichareroughlyusedinphysiologicaldataanalysis.Someexam-plesare:HiddenMarkovModelsforanomalydetectiontask[26,243],Bayesiannetworksforpredictiontask[97,143],Rule-basedreasoningforanomaly(event)recognition[16,127],Fourierandwavelettransformsformostlyfeatureselection[100,128]andnoisereductioninphysiologicaldata[74,195],andfi-nallyAssociationruleminingforpredictionanddiagnosistasks[114,234].Moredetailscanbefoundinaliteraturereviewarticle[28]thathavebeendonebytheauthorofthisthesis.

6.3. DATA SETS: ACQUISITION AND PROPERTIES 99

6.3 Data Sets: Acquisition and Properties

In any health monitoring system, having a robust data processing stage requiresadequate information about the data itself. Knowing about the way of collect-ing data and its properties while recording process will allow the data analysissystem to perform the tasks such as selecting the proper data mining approach,designing new methods, and tuning the parameters. This section examines thetypes of data and the methods for acquiring data that have been used in dif-ferent experimental validations in the literature. This information gives the op-portunity to distinguish the data processing methods that are applied based onthe type of sensor data.

6.3.1 Sensor Data Acquisition

Several input sources and data acquisition methods have been considered in theliterature for wearable sensor data in health monitoring systems. Here, threemajor data gathering approaches have been identified:

Experimental wearable sensor data: Studies developed health monitoring sys-tems have mostly used their own data gathering experiments to design,model and test the data analysis step [97,200,205]. In this case, the gath-ered data are usually obtained based on the predefined scenarios due tothe test and evaluate the performed results [208]. But usually, these stud-ies do not provide the precise annotations and meaningful labels on phys-iological signals.

Clinical or online databases of sensor data: Despite the attention of the lit-erature is on the role of data mining on vital signs in health moni-toring, several studies in this area have used the stored clinical datasets [61, 195]. The most of the works used categorised and complexmultivariate data sets with formal definitions and annotations by do-main expert [101, 118, 153]. One common online database is the Phy-sioNet [4, 105] that consists a wide range of physiological data setswith categorised and robust annotations for complex clinical signals.Several studies in the literature have used two main data sets in Phys-ioNet bank, MIMIC data sets (e.g. [22, 152, 161]) and MIT data sets(e.g. [74, 141, 170]) that contain the time series of patients vital signsobtained from hospital medical information systems.

Simulated sensor data: For the sake of having a comprehensive controlledanalysis system, few works have designed and tested their data miningmethods through simulated physiological data [226]. Data simulationwould be useful when the focus of data processing method is on theefficiency and the robustness of the information extraction [227, 243],rather than on handling real-world data including artefact and errors.

6.3.DATASETS:ACQUISITIONANDPROPERTIES99

6.3DataSets:AcquisitionandProperties

Inanyhealthmonitoringsystem,havingarobustdataprocessingstagerequiresadequateinformationaboutthedataitself.Knowingaboutthewayofcollect-ingdataanditspropertieswhilerecordingprocesswillallowthedataanalysissystemtoperformthetaskssuchasselectingtheproperdataminingapproach,designingnewmethods,andtuningtheparameters.Thissectionexaminesthetypesofdataandthemethodsforacquiringdatathathavebeenusedindif-ferentexperimentalvalidationsintheliterature.Thisinformationgivestheop-portunitytodistinguishthedataprocessingmethodsthatareappliedbasedonthetypeofsensordata.

6.3.1SensorDataAcquisition

Severalinputsourcesanddataacquisitionmethodshavebeenconsideredintheliteratureforwearablesensordatainhealthmonitoringsystems.Here,threemajordatagatheringapproacheshavebeenidentified:

Experimentalwearablesensordata:Studiesdevelopedhealthmonitoringsys-temshavemostlyusedtheirowndatagatheringexperimentstodesign,modelandtestthedataanalysisstep[97,200,205].Inthiscase,thegath-ereddataareusuallyobtainedbasedonthepredefinedscenariosduetothetestandevaluatetheperformedresults[208].Butusually,thesestud-iesdonotprovidethepreciseannotationsandmeaningfullabelsonphys-iologicalsignals.

Clinicaloronlinedatabasesofsensordata:Despitetheattentionofthelit-eratureisontheroleofdataminingonvitalsignsinhealthmoni-toring,severalstudiesinthisareahaveusedthestoredclinicaldatasets[61,195].Themostoftheworksusedcategorisedandcomplexmultivariatedatasetswithformaldefinitionsandannotationsbydo-mainexpert[101,118,153].OnecommononlinedatabaseisthePhy-sioNet[4,105]thatconsistsawiderangeofphysiologicaldatasetswithcategorisedandrobustannotationsforcomplexclinicalsignals.SeveralstudiesintheliteraturehaveusedtwomaindatasetsinPhys-ioNetbank,MIMICdatasets(e.g.[22,152,161])andMITdatasets(e.g.[74,141,170])thatcontainthetimeseriesofpatientsvitalsignsobtainedfromhospitalmedicalinformationsystems.

Simulatedsensordata:Forthesakeofhavingacomprehensivecontrolledanalysissystem,fewworkshavedesignedandtestedtheirdataminingmethodsthroughsimulatedphysiologicaldata[226].Datasimulationwouldbeusefulwhenthefocusofdataprocessingmethodisontheefficiencyandtherobustnessoftheinformationextraction[227,243],ratherthanonhandlingreal-worlddataincludingartefactanderrors.

6.3. DATA SETS: ACQUISITION AND PROPERTIES 99

6.3 Data Sets: Acquisition and Properties

In any health monitoring system, having a robust data processing stage requiresadequate information about the data itself. Knowing about the way of collect-ing data and its properties while recording process will allow the data analysissystem to perform the tasks such as selecting the proper data mining approach,designing new methods, and tuning the parameters. This section examines thetypes of data and the methods for acquiring data that have been used in dif-ferent experimental validations in the literature. This information gives the op-portunity to distinguish the data processing methods that are applied based onthe type of sensor data.

6.3.1 Sensor Data Acquisition

Several input sources and data acquisition methods have been considered in theliterature for wearable sensor data in health monitoring systems. Here, threemajor data gathering approaches have been identified:

Experimental wearable sensor data: Studies developed health monitoring sys-tems have mostly used their own data gathering experiments to design,model and test the data analysis step [97,200,205]. In this case, the gath-ered data are usually obtained based on the predefined scenarios due tothe test and evaluate the performed results [208]. But usually, these stud-ies do not provide the precise annotations and meaningful labels on phys-iological signals.

Clinical or online databases of sensor data: Despite the attention of the lit-erature is on the role of data mining on vital signs in health moni-toring, several studies in this area have used the stored clinical datasets [61, 195]. The most of the works used categorised and complexmultivariate data sets with formal definitions and annotations by do-main expert [101, 118, 153]. One common online database is the Phy-sioNet [4, 105] that consists a wide range of physiological data setswith categorised and robust annotations for complex clinical signals.Several studies in the literature have used two main data sets in Phys-ioNet bank, MIMIC data sets (e.g. [22, 152, 161]) and MIT data sets(e.g. [74, 141, 170]) that contain the time series of patients vital signsobtained from hospital medical information systems.

Simulated sensor data: For the sake of having a comprehensive controlledanalysis system, few works have designed and tested their data miningmethods through simulated physiological data [226]. Data simulationwould be useful when the focus of data processing method is on theefficiency and the robustness of the information extraction [227, 243],rather than on handling real-world data including artefact and errors.

6.3.DATASETS:ACQUISITIONANDPROPERTIES99

6.3DataSets:AcquisitionandProperties

Inanyhealthmonitoringsystem,havingarobustdataprocessingstagerequiresadequateinformationaboutthedataitself.Knowingaboutthewayofcollect-ingdataanditspropertieswhilerecordingprocesswillallowthedataanalysissystemtoperformthetaskssuchasselectingtheproperdataminingapproach,designingnewmethods,andtuningtheparameters.Thissectionexaminesthetypesofdataandthemethodsforacquiringdatathathavebeenusedindif-ferentexperimentalvalidationsintheliterature.Thisinformationgivestheop-portunitytodistinguishthedataprocessingmethodsthatareappliedbasedonthetypeofsensordata.

6.3.1SensorDataAcquisition

Severalinputsourcesanddataacquisitionmethodshavebeenconsideredintheliteratureforwearablesensordatainhealthmonitoringsystems.Here,threemajordatagatheringapproacheshavebeenidentified:

Experimentalwearablesensordata:Studiesdevelopedhealthmonitoringsys-temshavemostlyusedtheirowndatagatheringexperimentstodesign,modelandtestthedataanalysisstep[97,200,205].Inthiscase,thegath-ereddataareusuallyobtainedbasedonthepredefinedscenariosduetothetestandevaluatetheperformedresults[208].Butusually,thesestud-iesdonotprovidethepreciseannotationsandmeaningfullabelsonphys-iologicalsignals.

Clinicaloronlinedatabasesofsensordata:Despitetheattentionofthelit-eratureisontheroleofdataminingonvitalsignsinhealthmoni-toring,severalstudiesinthisareahaveusedthestoredclinicaldatasets[61,195].Themostoftheworksusedcategorisedandcomplexmultivariatedatasetswithformaldefinitionsandannotationsbydo-mainexpert[101,118,153].OnecommononlinedatabaseisthePhy-sioNet[4,105]thatconsistsawiderangeofphysiologicaldatasetswithcategorisedandrobustannotationsforcomplexclinicalsignals.SeveralstudiesintheliteraturehaveusedtwomaindatasetsinPhys-ioNetbank,MIMICdatasets(e.g.[22,152,161])andMITdatasets(e.g.[74,141,170])thatcontainthetimeseriesofpatientsvitalsignsobtainedfromhospitalmedicalinformationsystems.

Simulatedsensordata:Forthesakeofhavingacomprehensivecontrolledanalysissystem,fewworkshavedesignedandtestedtheirdataminingmethodsthroughsimulatedphysiologicaldata[226].Datasimulationwouldbeusefulwhenthefocusofdataprocessingmethodisontheefficiencyandtherobustnessoftheinformationextraction[227,243],ratherthanonhandlingreal-worlddataincludingartefactanderrors.


Another reason to create and use simulated data is the lack of long-termand large-scale data sets [243], where simulated data helps the proposeddata mining systems to work with huge amount of data.

6.3.2 Sensor Data Properties

In addition to data acquisition methods, the following properties of wearablesensor data have been collected from the literature. These properties presentwhich kind of data sets are usually used in healthcare systems.

Time horizon (long-term/short-term): The length of time for considering dataset measurements is a particular challenge for wearable sensor data inorder to orientate data mining techniques and the manner of data inter-pretation. Here, the time horizon of considered sensor data is categorisedto short-term and long-term data. Some data analysis systems in health-care were designed to process short signals such as a few minutes of ECGdata [123,170], a few hours of heart rate or oxygen saturation [22,152]or the measurement of blood pressures over a day [13]. On the otherhand, dealing with long-term data is a significant portion of some datamining methods for handling and processing. This period could be longerthan a number of days or even a year of measurements. Blood glucosemonitoring is an example of long-term data analysis for the sake of rightdecision making [55,243].

Scale (large/small): A big challenge of data analysis in any health monitor-ing system is the examination of the proposed method on more than anindividual. Depending on the design of sensor network, data gathering,and the goal of decision making, the scale of subjects in the frameworkswould differ. Here, the studies considering 50 or more subjects (patientor healthy) are counted as large-scale studies [118], which can handle thesame data modelling for large-scale of monitoring [61].

Labelling (annotated/unlabelled): Health monitoring systems need to evalu-ate their results to show the correctness of the decision making process.Due to having significant data analysis step, the attention of the most re-search is given to annotated data. By considering the behaviour of vitalsigns the domain expert is able to mark the data with several annota-tions such as arrhythmia disease [141], sleep discords [43], the severity ofhealth [22], stress levels [208], and abnormal pulse in ECG [222]. Theseannotations also acquired using another source of knowledge like elec-tronic health record (EHR), coronary syndromes, and also the historyof vital signs [153]. On the other hand, working with unlabelled dataleads to having unsupervised learning methods to extract unseen knowl-edge among raw sensor data. Some studies have been done on unlabelled


Anotherreasontocreateandusesimulateddataisthelackoflong-termandlarge-scaledatasets[243],wheresimulateddatahelpstheproposeddataminingsystemstoworkwithhugeamountofdata.

6.3.2SensorDataProperties

Inadditiontodataacquisitionmethods,thefollowingpropertiesofwearablesensordatahavebeencollectedfromtheliterature.Thesepropertiespresentwhichkindofdatasetsareusuallyusedinhealthcaresystems.

Timehorizon(long-term/short-term):Thelengthoftimeforconsideringdatasetmeasurementsisaparticularchallengeforwearablesensordatainordertoorientatedataminingtechniquesandthemannerofdatainter-pretation.Here,thetimehorizonofconsideredsensordataiscategorisedtoshort-termandlong-termdata.Somedataanalysissystemsinhealth-careweredesignedtoprocessshortsignalssuchasafewminutesofECGdata[123,170],afewhoursofheartrateoroxygensaturation[22,152]orthemeasurementofbloodpressuresoveraday[13].Ontheotherhand,dealingwithlong-termdataisasignificantportionofsomedataminingmethodsforhandlingandprocessing.Thisperiodcouldbelongerthananumberofdaysorevenayearofmeasurements.Bloodglucosemonitoringisanexampleoflong-termdataanalysisforthesakeofrightdecisionmaking[55,243].

Scale(large/small):Abigchallengeofdataanalysisinanyhealthmonitor-ingsystemistheexaminationoftheproposedmethodonmorethananindividual.Dependingonthedesignofsensornetwork,datagathering,andthegoalofdecisionmaking,thescaleofsubjectsintheframeworkswoulddiffer.Here,thestudiesconsidering50ormoresubjects(patientorhealthy)arecountedaslarge-scalestudies[118],whichcanhandlethesamedatamodellingforlarge-scaleofmonitoring[61].

Labelling(annotated/unlabelled):Healthmonitoringsystemsneedtoevalu-atetheirresultstoshowthecorrectnessofthedecisionmakingprocess.Duetohavingsignificantdataanalysisstep,theattentionofthemostre-searchisgiventoannotateddata.Byconsideringthebehaviourofvitalsignsthedomainexpertisabletomarkthedatawithseveralannota-tionssuchasarrhythmiadisease[141],sleepdiscords[43],theseverityofhealth[22],stresslevels[208],andabnormalpulseinECG[222].Theseannotationsalsoacquiredusinganothersourceofknowledgelikeelec-tronichealthrecord(EHR),coronarysyndromes,andalsothehistoryofvitalsigns[153].Ontheotherhand,workingwithunlabelleddataleadstohavingunsupervisedlearningmethodstoextractunseenknowl-edgeamongrawsensordata.Somestudieshavebeendoneonunlabelled


Another reason to create and use simulated data is the lack of long-termand large-scale data sets [243], where simulated data helps the proposeddata mining systems to work with huge amount of data.

6.3.2 Sensor Data Properties

In addition to data acquisition methods, the following properties of wearablesensor data have been collected from the literature. These properties presentwhich kind of data sets are usually used in healthcare systems.

Time horizon (long-term/short-term): The length of time for considering dataset measurements is a particular challenge for wearable sensor data inorder to orientate data mining techniques and the manner of data inter-pretation. Here, the time horizon of considered sensor data is categorisedto short-term and long-term data. Some data analysis systems in health-care were designed to process short signals such as a few minutes of ECGdata [123,170], a few hours of heart rate or oxygen saturation [22,152]or the measurement of blood pressures over a day [13]. On the otherhand, dealing with long-term data is a significant portion of some datamining methods for handling and processing. This period could be longerthan a number of days or even a year of measurements. Blood glucosemonitoring is an example of long-term data analysis for the sake of rightdecision making [55,243].

Scale (large/small): A big challenge of data analysis in any health monitor-ing system is the examination of the proposed method on more than anindividual. Depending on the design of sensor network, data gathering,and the goal of decision making, the scale of subjects in the frameworkswould differ. Here, the studies considering 50 or more subjects (patientor healthy) are counted as large-scale studies [118], which can handle thesame data modelling for large-scale of monitoring [61].

Labelling (annotated/unlabelled): Health monitoring systems need to evalu-ate their results to show the correctness of the decision making process.Due to having significant data analysis step, the attention of the most re-search is given to annotated data. By considering the behaviour of vitalsigns the domain expert is able to mark the data with several annota-tions such as arrhythmia disease [141], sleep discords [43], the severity ofhealth [22], stress levels [208], and abnormal pulse in ECG [222]. Theseannotations also acquired using another source of knowledge like elec-tronic health record (EHR), coronary syndromes, and also the historyof vital signs [153]. On the other hand, working with unlabelled dataleads to having unsupervised learning methods to extract unseen knowl-edge among raw sensor data. Some studies have been done on unlabelled


Anotherreasontocreateandusesimulateddataisthelackoflong-termandlarge-scaledatasets[243],wheresimulateddatahelpstheproposeddataminingsystemstoworkwithhugeamountofdata.

6.3.2SensorDataProperties

Inadditiontodataacquisitionmethods,thefollowingpropertiesofwearablesensordatahavebeencollectedfromtheliterature.Thesepropertiespresentwhichkindofdatasetsareusuallyusedinhealthcaresystems.

Timehorizon(long-term/short-term):Thelengthoftimeforconsideringdatasetmeasurementsisaparticularchallengeforwearablesensordatainordertoorientatedataminingtechniquesandthemannerofdatainter-pretation.Here,thetimehorizonofconsideredsensordataiscategorisedtoshort-termandlong-termdata.Somedataanalysissystemsinhealth-careweredesignedtoprocessshortsignalssuchasafewminutesofECGdata[123,170],afewhoursofheartrateoroxygensaturation[22,152]orthemeasurementofbloodpressuresoveraday[13].Ontheotherhand,dealingwithlong-termdataisasignificantportionofsomedataminingmethodsforhandlingandprocessing.Thisperiodcouldbelongerthananumberofdaysorevenayearofmeasurements.Bloodglucosemonitoringisanexampleoflong-termdataanalysisforthesakeofrightdecisionmaking[55,243].

Scale(large/small):Abigchallengeofdataanalysisinanyhealthmonitor-ingsystemistheexaminationoftheproposedmethodonmorethananindividual.Dependingonthedesignofsensornetwork,datagathering,andthegoalofdecisionmaking,thescaleofsubjectsintheframeworkswoulddiffer.Here,thestudiesconsidering50ormoresubjects(patientorhealthy)arecountedaslarge-scalestudies[118],whichcanhandlethesamedatamodellingforlarge-scaleofmonitoring[61].

Labelling(annotated/unlabelled):Healthmonitoringsystemsneedtoevalu-atetheirresultstoshowthecorrectnessofthedecisionmakingprocess.Duetohavingsignificantdataanalysisstep,theattentionofthemostre-searchisgiventoannotateddata.Byconsideringthebehaviourofvitalsignsthedomainexpertisabletomarkthedatawithseveralannota-tionssuchasarrhythmiadisease[141],sleepdiscords[43],theseverityofhealth[22],stresslevels[208],andabnormalpulseinECG[222].Theseannotationsalsoacquiredusinganothersourceofknowledgelikeelec-tronichealthrecord(EHR),coronarysyndromes,andalsothehistoryofvitalsigns[153].Ontheotherhand,workingwithunlabelleddataleadstohavingunsupervisedlearningmethodstoextractunseenknowl-edgeamongrawsensordata.Somestudieshavebeendoneonunlabelled

6.4. DISCUSSION AND CHALLENGES 101

data to consider uncontrolled situations for especially experimental datasets [41,200,218].

Single Sensor/Multiple Sensors: Commonly, single sensor data have been usedfor specific analysis on individual physiological data such as ECG signalanalysis [74, 170, 222] or blood glucose monitoring [55, 243]. Besides,some of the researchers have used several sensors [41, 118, 144, 153]to have global reasoning in health monitoring. Although using severalwearable sensors in health monitoring frameworks are common, but fewpieces of research have performed the multivariate data analysis to ex-tract useful information through multi sensor data [118].

6.4 Discussion and Challenges

This chapter has presented an overview of the data mining approaches usedfor analysing the measured vital signs from the wearable sensing devices. Foreach approach, a reflection of its suitability for health monitoring was pro-vided. From these reflections, the following guidelines for applying data miningmethods were extracted:

• The selected data mining technique is highly dependent on which datamining task is in focus. According to the data mining tasks mentionedin Section 6.1, for anomaly detection task, SVM, HMM, statistical toolsand frequency analysis are more commonly applied. Prediction tasks haveoften used decision tree methods as well as other supervised techniques.But rule-based methods, GMM, and frequency analysis have not beenemployed for prediction due to the shortcoming in modelling the databehaviours. Finally, any decision making task needs a modelling andinferring system with considering the contextual information. So, non-statistical methods like SVM, NN, and decision tree techniques have usu-ally applied with success.

• The requirements for a real-time system should guide the selection of thedata mining methods. To design a real-time health monitoring system,such methods like NN, GMM, and frequency analysis are not efficient forthe sake of their computational complexities. But simple methods such asrule-based, decision tree and statistical techniques can quickly handle theonline data processing requirements.

• The properties of the data set and experimental condition also influencethe choice of method. Data mining methods (e.g., rule-based, decisiontree) have been used in the clinical situation with the controlled con-ditions and clear data sets, but the efficiency of them are not tested inreal experiments of healthcare services. In contrast, NN, HMM, and fre-quency techniques have been used to handle complex physiological dataand discover the unexpected patterns in real-world situations.

6.4.DISCUSSIONANDCHALLENGES101

datatoconsideruncontrolledsituationsforespeciallyexperimentaldatasets[41,200,218].

SingleSensor/MultipleSensors:Commonly,singlesensordatahavebeenusedforspecificanalysisonindividualphysiologicaldatasuchasECGsignalanalysis[74,170,222]orbloodglucosemonitoring[55,243].Besides,someoftheresearchershaveusedseveralsensors[41,118,144,153]tohaveglobalreasoninginhealthmonitoring.Althoughusingseveralwearablesensorsinhealthmonitoringframeworksarecommon,butfewpiecesofresearchhaveperformedthemultivariatedataanalysistoex-tractusefulinformationthroughmultisensordata[118].

6.4DiscussionandChallenges

Thischapterhaspresentedanoverviewofthedataminingapproachesusedforanalysingthemeasuredvitalsignsfromthewearablesensingdevices.Foreachapproach,areflectionofitssuitabilityforhealthmonitoringwaspro-vided.Fromthesereflections,thefollowingguidelinesforapplyingdataminingmethodswereextracted:

•Theselecteddataminingtechniqueishighlydependentonwhichdataminingtaskisinfocus.AccordingtothedataminingtasksmentionedinSection6.1,foranomalydetectiontask,SVM,HMM,statisticaltoolsandfrequencyanalysisaremorecommonlyapplied.Predictiontaskshaveoftenuseddecisiontreemethodsaswellasothersupervisedtechniques.Butrule-basedmethods,GMM,andfrequencyanalysishavenotbeenemployedforpredictionduetotheshortcominginmodellingthedatabehaviours.Finally,anydecisionmakingtaskneedsamodellingandinferringsystemwithconsideringthecontextualinformation.So,non-statisticalmethodslikeSVM,NN,anddecisiontreetechniqueshaveusu-allyappliedwithsuccess.

•Therequirementsforareal-timesystemshouldguidetheselectionofthedataminingmethods.Todesignareal-timehealthmonitoringsystem,suchmethodslikeNN,GMM,andfrequencyanalysisarenotefficientforthesakeoftheircomputationalcomplexities.Butsimplemethodssuchasrule-based,decisiontreeandstatisticaltechniquescanquicklyhandletheonlinedataprocessingrequirements.

•Thepropertiesofthedatasetandexperimentalconditionalsoinfluencethechoiceofmethod.Dataminingmethods(e.g.,rule-based,decisiontree)havebeenusedintheclinicalsituationwiththecontrolledcon-ditionsandcleardatasets,buttheefficiencyofthemarenottestedinrealexperimentsofhealthcareservices.Incontrast,NN,HMM,andfre-quencytechniqueshavebeenusedtohandlecomplexphysiologicaldataanddiscovertheunexpectedpatternsinreal-worldsituations.


data to consider uncontrolled situations for especially experimental datasets [41,200,218].

Single Sensor/Multiple Sensors: Commonly, single sensor data have been usedfor specific analysis on individual physiological data such as ECG signalanalysis [74, 170, 222] or blood glucose monitoring [55, 243]. Besides,some of the researchers have used several sensors [41, 118, 144, 153]to have global reasoning in health monitoring. Although using severalwearable sensors in health monitoring frameworks are common, but fewpieces of research have performed the multivariate data analysis to ex-tract useful information through multi sensor data [118].

6.4 Discussion and Challenges

This chapter has presented an overview of the data mining approaches usedfor analysing the measured vital signs from the wearable sensing devices. Foreach approach, a reflection of its suitability for health monitoring was pro-vided. From these reflections, the following guidelines for applying data miningmethods were extracted:

• The selected data mining technique is highly dependent on which datamining task is in focus. According to the data mining tasks mentionedin Section 6.1, for anomaly detection task, SVM, HMM, statistical toolsand frequency analysis are more commonly applied. Prediction tasks haveoften used decision tree methods as well as other supervised techniques.But rule-based methods, GMM, and frequency analysis have not beenemployed for prediction due to the shortcoming in modelling the databehaviours. Finally, any decision making task needs a modelling andinferring system with considering the contextual information. So, non-statistical methods like SVM, NN, and decision tree techniques have usu-ally applied with success.

• The requirements for a real-time system should guide the selection of thedata mining methods. To design a real-time health monitoring system,such methods like NN, GMM, and frequency analysis are not efficient forthe sake of their computational complexities. But simple methods such asrule-based, decision tree and statistical techniques can quickly handle theonline data processing requirements.

• The properties of the data set and experimental condition also influencethe choice of method. Data mining methods (e.g., rule-based, decisiontree) have been used in the clinical situation with the controlled con-ditions and clear data sets, but the efficiency of them are not tested inreal experiments of healthcare services. In contrast, NN, HMM, and fre-quency techniques have been used to handle complex physiological dataand discover the unexpected patterns in real-world situations.


datatoconsideruncontrolledsituationsforespeciallyexperimentaldatasets[41,200,218].

SingleSensor/MultipleSensors:Commonly,singlesensordatahavebeenusedforspecificanalysisonindividualphysiologicaldatasuchasECGsignalanalysis[74,170,222]orbloodglucosemonitoring[55,243].Besides,someoftheresearchershaveusedseveralsensors[41,118,144,153]tohaveglobalreasoninginhealthmonitoring.Althoughusingseveralwearablesensorsinhealthmonitoringframeworksarecommon,butfewpiecesofresearchhaveperformedthemultivariatedataanalysistoex-tractusefulinformationthroughmultisensordata[118].

6.4DiscussionandChallenges

Thischapterhaspresentedanoverviewofthedataminingapproachesusedforanalysingthemeasuredvitalsignsfromthewearablesensingdevices.Foreachapproach,areflectionofitssuitabilityforhealthmonitoringwaspro-vided.Fromthesereflections,thefollowingguidelinesforapplyingdataminingmethodswereextracted:

•Theselecteddataminingtechniqueishighlydependentonwhichdataminingtaskisinfocus.AccordingtothedataminingtasksmentionedinSection6.1,foranomalydetectiontask,SVM,HMM,statisticaltoolsandfrequencyanalysisaremorecommonlyapplied.Predictiontaskshaveoftenuseddecisiontreemethodsaswellasothersupervisedtechniques.Butrule-basedmethods,GMM,andfrequencyanalysishavenotbeenemployedforpredictionduetotheshortcominginmodellingthedatabehaviours.Finally,anydecisionmakingtaskneedsamodellingandinferringsystemwithconsideringthecontextualinformation.So,non-statisticalmethodslikeSVM,NN,anddecisiontreetechniqueshaveusu-allyappliedwithsuccess.

•Therequirementsforareal-timesystemshouldguidetheselectionofthedataminingmethods.Todesignareal-timehealthmonitoringsystem,suchmethodslikeNN,GMM,andfrequencyanalysisarenotefficientforthesakeoftheircomputationalcomplexities.Butsimplemethodssuchasrule-based,decisiontreeandstatisticaltechniquescanquicklyhandletheonlinedataprocessingrequirements.

•Thepropertiesofthedatasetandexperimentalconditionalsoinfluencethechoiceofmethod.Dataminingmethods(e.g.,rule-based,decisiontree)havebeenusedintheclinicalsituationwiththecontrolledcon-ditionsandcleardatasets,buttheefficiencyofthemarenottestedinrealexperimentsofhealthcareservices.Incontrast,NN,HMM,andfre-quencytechniqueshavebeenusedtohandlecomplexphysiologicaldataanddiscovertheunexpectedpatternsinreal-worldsituations.


• The level of supervision and labelled data is a crucial factor. Methodslike SVM, NN, and GMM have been designed and justified to modellong-term data. However, they could not deal with the unlabelled data tomodel the raw data in an unsupervised manner. For multivariate analysisof sensor data, the methods such as rule-based, decision tree, and statisti-cal tools are more usable, while, e.g., GMM and HMM cannot play thisrole in healthcare systems.

While these guidelines can assist implementing technical systems to selectappropriate methods for data analysis, the field is still challenged by some fac-tors which have been discussed below. These are general challenges in the areaof health monitoring and have been emerged from the literature studies inves-tigated in this chapter. They include:

Need for large-scale monitoring: One challenge is still many applications us-ing more massive data sets, and also still consider the monitoring task in theclinical contexts. This challenge will become more important for applicationswhich examine target groups such as elderly, healthy persons etc. to make thesignificant effort in collecting reliable data sets for processing.

Dealing with annotated data sets: Data mining approaches gain increasingattention in this field, open data sets, as well as benchmark data sets, becomeessential to validate different approaches. Still however in this area, few bench-mark data sets are available. The last mentioned point raises a second challengeof how data annotation (labelling) can be best done for such target groups. Theprocess of annotating data is expensive, time-consuming and non-trivial con-sidering long-term continuous data. To confront this challenge, an exciting av-enue of study will be the efficacy of data mining in unsupervised contexts usingunlabelled data sets. This type of data mining applies both to the modelling aswell as eventual preprocessing of data, where for example, unsupervised featurelearning techniques [136] for time series data could show promise.

Multiple measurements: Another challenge in this field is to exploit the mul-tiple measurements of vital signs simultaneously. In particular, sensor fusiontechniques which are able to consider dependencies and correlations betweendifferent vital sign parameters could assist in performing the primary data min-ing tasks of prediction, decision making and anomaly detection. Some attentionto this issue has been given in the literature such as [118].

Contextual information: The usage of contextual information to assist in datamining is of ever increasing importance. Such contextual information couldinclude meta information about subjects such as weight, height, age, sex, his-tory of vital signs, as well as the history of previous decisions. It is also pos-


•Thelevelofsupervisionandlabelleddataisacrucialfactor.MethodslikeSVM,NN,andGMMhavebeendesignedandjustifiedtomodellong-termdata.However,theycouldnotdealwiththeunlabelleddatatomodeltherawdatainanunsupervisedmanner.Formultivariateanalysisofsensordata,themethodssuchasrule-based,decisiontree,andstatisti-caltoolsaremoreusable,while,e.g.,GMMandHMMcannotplaythisroleinhealthcaresystems.

Whiletheseguidelinescanassistimplementingtechnicalsystemstoselectappropriatemethodsfordataanalysis,thefieldisstillchallengedbysomefac-torswhichhavebeendiscussedbelow.Thesearegeneralchallengesintheareaofhealthmonitoringandhavebeenemergedfromtheliteraturestudiesinves-tigatedinthischapter.Theyinclude:

Needforlarge-scalemonitoring:Onechallengeisstillmanyapplicationsus-ingmoremassivedatasets,andalsostillconsiderthemonitoringtaskintheclinicalcontexts.Thischallengewillbecomemoreimportantforapplicationswhichexaminetargetgroupssuchaselderly,healthypersonsetc.tomakethesignificanteffortincollectingreliabledatasetsforprocessing.

Dealingwithannotateddatasets:Dataminingapproachesgainincreasingattentioninthisfield,opendatasets,aswellasbenchmarkdatasets,becomeessentialtovalidatedifferentapproaches.Stillhoweverinthisarea,fewbench-markdatasetsareavailable.Thelastmentionedpointraisesasecondchallengeofhowdataannotation(labelling)canbebestdoneforsuchtargetgroups.Theprocessofannotatingdataisexpensive,time-consumingandnon-trivialcon-sideringlong-termcontinuousdata.Toconfrontthischallenge,anexcitingav-enueofstudywillbetheefficacyofdatamininginunsupervisedcontextsusingunlabelleddatasets.Thistypeofdataminingappliesbothtothemodellingaswellaseventualpreprocessingofdata,whereforexample,unsupervisedfeaturelearningtechniques[136]fortimeseriesdatacouldshowpromise.

Multiplemeasurements:Anotherchallengeinthisfieldistoexploitthemul-tiplemeasurementsofvitalsignssimultaneously.Inparticular,sensorfusiontechniqueswhichareabletoconsiderdependenciesandcorrelationsbetweendifferentvitalsignparameterscouldassistinperformingtheprimarydatamin-ingtasksofprediction,decisionmakingandanomalydetection.Someattentiontothisissuehasbeengivenintheliteraturesuchas[118].

Contextualinformation:Theusageofcontextualinformationtoassistindataminingisofeverincreasingimportance.Suchcontextualinformationcouldincludemetainformationaboutsubjectssuchasweight,height,age,sex,his-toryofvitalsigns,aswellasthehistoryofpreviousdecisions.Itisalsopos-


• The level of supervision and labelled data is a crucial factor. Methodslike SVM, NN, and GMM have been designed and justified to modellong-term data. However, they could not deal with the unlabelled data tomodel the raw data in an unsupervised manner. For multivariate analysisof sensor data, the methods such as rule-based, decision tree, and statisti-cal tools are more usable, while, e.g., GMM and HMM cannot play thisrole in healthcare systems.

While these guidelines can assist implementing technical systems to selectappropriate methods for data analysis, the field is still challenged by some fac-tors which have been discussed below. These are general challenges in the areaof health monitoring and have been emerged from the literature studies inves-tigated in this chapter. They include:

Need for large-scale monitoring: One challenge is still many applications us-ing more massive data sets, and also still consider the monitoring task in theclinical contexts. This challenge will become more important for applicationswhich examine target groups such as elderly, healthy persons etc. to make thesignificant effort in collecting reliable data sets for processing.

Dealing with annotated data sets: Data mining approaches gain increasingattention in this field, open data sets, as well as benchmark data sets, becomeessential to validate different approaches. Still however in this area, few bench-mark data sets are available. The last mentioned point raises a second challengeof how data annotation (labelling) can be best done for such target groups. Theprocess of annotating data is expensive, time-consuming and non-trivial con-sidering long-term continuous data. To confront this challenge, an exciting av-enue of study will be the efficacy of data mining in unsupervised contexts usingunlabelled data sets. This type of data mining applies both to the modelling aswell as eventual preprocessing of data, where for example, unsupervised featurelearning techniques [136] for time series data could show promise.

Multiple measurements: Another challenge in this field is to exploit the mul-tiple measurements of vital signs simultaneously. In particular, sensor fusiontechniques which are able to consider dependencies and correlations betweendifferent vital sign parameters could assist in performing the primary data min-ing tasks of prediction, decision making and anomaly detection. Some attentionto this issue has been given in the literature such as [118].

Contextual information: The usage of contextual information to assist in datamining is of ever increasing importance. Such contextual information couldinclude meta information about subjects such as weight, height, age, sex, his-tory of vital signs, as well as the history of previous decisions. It is also pos-


•Thelevelofsupervisionandlabelleddataisacrucialfactor.MethodslikeSVM,NN,andGMMhavebeendesignedandjustifiedtomodellong-termdata.However,theycouldnotdealwiththeunlabelleddatatomodeltherawdatainanunsupervisedmanner.Formultivariateanalysisofsensordata,themethodssuchasrule-based,decisiontree,andstatisti-caltoolsaremoreusable,while,e.g.,GMMandHMMcannotplaythisroleinhealthcaresystems.

Whiletheseguidelinescanassistimplementingtechnicalsystemstoselectappropriatemethodsfordataanalysis,thefieldisstillchallengedbysomefac-torswhichhavebeendiscussedbelow.Thesearegeneralchallengesintheareaofhealthmonitoringandhavebeenemergedfromtheliteraturestudiesinves-tigatedinthischapter.Theyinclude:

Needforlarge-scalemonitoring:Onechallengeisstillmanyapplicationsus-ingmoremassivedatasets,andalsostillconsiderthemonitoringtaskintheclinicalcontexts.Thischallengewillbecomemoreimportantforapplicationswhichexaminetargetgroupssuchaselderly,healthypersonsetc.tomakethesignificanteffortincollectingreliabledatasetsforprocessing.

Dealingwithannotateddatasets:Dataminingapproachesgainincreasingattentioninthisfield,opendatasets,aswellasbenchmarkdatasets,becomeessentialtovalidatedifferentapproaches.Stillhoweverinthisarea,fewbench-markdatasetsareavailable.Thelastmentionedpointraisesasecondchallengeofhowdataannotation(labelling)canbebestdoneforsuchtargetgroups.Theprocessofannotatingdataisexpensive,time-consumingandnon-trivialcon-sideringlong-termcontinuousdata.Toconfrontthischallenge,anexcitingav-enueofstudywillbetheefficacyofdatamininginunsupervisedcontextsusingunlabelleddatasets.Thistypeofdataminingappliesbothtothemodellingaswellaseventualpreprocessingofdata,whereforexample,unsupervisedfeaturelearningtechniques[136]fortimeseriesdatacouldshowpromise.

Multiplemeasurements:Anotherchallengeinthisfieldistoexploitthemul-tiplemeasurementsofvitalsignssimultaneously.Inparticular,sensorfusiontechniqueswhichareabletoconsiderdependenciesandcorrelationsbetweendifferentvitalsignparameterscouldassistinperformingtheprimarydatamin-ingtasksofprediction,decisionmakingandanomalydetection.Someattentiontothisissuehasbeengivenintheliteraturesuchas[118].

Contextualinformation:Theusageofcontextualinformationtoassistindataminingisofeverincreasingimportance.Suchcontextualinformationcouldincludemetainformationaboutsubjectssuchasweight,height,age,sex,his-toryofvitalsigns,aswellasthehistoryofpreviousdecisions.Itisalsopos-


sible to automate the retrieval of high-level information via available ontolo-gies [17,132] and link this information to the data.

Discovering of unseen features: Still, important features of the data whichmay be unintuitive, e.g., frequency domain features may be needed for provid-ing proper analysis and uncovering essential characteristics from the data whichcannot be obtained by hand-engineered features. It is also worth noting that inreal-world system such as home monitoring, it would be difficult to model theunexpected features with straightforward techniques.

Post-processing and representation: As a result of healthcare systems, anupcoming approach could use classical data mining techniques together withmethods such as natural language generation which uncover trends in the databut also explain the process to both expert and non-expert users. Works suchas [29,119] have demonstrated the possible uses of such systems in both clinicaland experimental contexts.

In sum, this chapter has provided an overview of the current trends andchallenges of mining physiological sensor data within health monitoring sys-tems. The chapter investigated 1) the main data mining tasks in health moni-toring systems, 2) the dominant data mining approaches that have been used,and 3) various data types and their properties.

In relation to the rest of this thesis, this chapter has shown the need of con-sidering new data analysis approaches to extract valuable information from thedata beyond expert knowledge. This need of data-driven mining of sensor datais then performed in chapters 7 and 8, as the first step within the framework ofdescribing physiological sensor data.


sibletoautomatetheretrievalofhigh-levelinformationviaavailableontolo-gies[17,132]andlinkthisinformationtothedata.

Discoveringofunseenfeatures:Still,importantfeaturesofthedatawhichmaybeunintuitive,e.g.,frequencydomainfeaturesmaybeneededforprovid-ingproperanalysisanduncoveringessentialcharacteristicsfromthedatawhichcannotbeobtainedbyhand-engineeredfeatures.Itisalsoworthnotingthatinreal-worldsystemsuchashomemonitoring,itwouldbedifficulttomodeltheunexpectedfeatureswithstraightforwardtechniques.

Post-processingandrepresentation:Asaresultofhealthcaresystems,anupcomingapproachcoulduseclassicaldataminingtechniquestogetherwithmethodssuchasnaturallanguagegenerationwhichuncovertrendsinthedatabutalsoexplaintheprocesstobothexpertandnon-expertusers.Workssuchas[29,119]havedemonstratedthepossibleusesofsuchsystemsinbothclinicalandexperimentalcontexts.

Insum,thischapterhasprovidedanoverviewofthecurrenttrendsandchallengesofminingphysiologicalsensordatawithinhealthmonitoringsys-tems.Thechapterinvestigated1)themaindataminingtasksinhealthmoni-toringsystems,2)thedominantdataminingapproachesthathavebeenused,and3)variousdatatypesandtheirproperties.

Inrelationtotherestofthisthesis,thischapterhasshowntheneedofcon-sideringnewdataanalysisapproachestoextractvaluableinformationfromthedatabeyondexpertknowledge.Thisneedofdata-drivenminingofsensordataisthenperformedinchapters7and8,asthefirststepwithintheframeworkofdescribingphysiologicalsensordata.


sible to automate the retrieval of high-level information via available ontolo-gies [17,132] and link this information to the data.

Discovering of unseen features: Still, important features of the data whichmay be unintuitive, e.g., frequency domain features may be needed for provid-ing proper analysis and uncovering essential characteristics from the data whichcannot be obtained by hand-engineered features. It is also worth noting that inreal-world system such as home monitoring, it would be difficult to model theunexpected features with straightforward techniques.

Post-processing and representation: As a result of healthcare systems, anupcoming approach could use classical data mining techniques together withmethods such as natural language generation which uncover trends in the databut also explain the process to both expert and non-expert users. Works suchas [29,119] have demonstrated the possible uses of such systems in both clinicaland experimental contexts.

In sum, this chapter has provided an overview of the current trends andchallenges of mining physiological sensor data within health monitoring sys-tems. The chapter investigated 1) the main data mining tasks in health moni-toring systems, 2) the dominant data mining approaches that have been used,and 3) various data types and their properties.

In relation to the rest of this thesis, this chapter has shown the need of con-sidering new data analysis approaches to extract valuable information from thedata beyond expert knowledge. This need of data-driven mining of sensor datais then performed in chapters 7 and 8, as the first step within the framework ofdescribing physiological sensor data.


sibletoautomatetheretrievalofhigh-levelinformationviaavailableontolo-gies[17,132]andlinkthisinformationtothedata.

Discoveringofunseenfeatures:Still,importantfeaturesofthedatawhichmaybeunintuitive,e.g.,frequencydomainfeaturesmaybeneededforprovid-ingproperanalysisanduncoveringessentialcharacteristicsfromthedatawhichcannotbeobtainedbyhand-engineeredfeatures.Itisalsoworthnotingthatinreal-worldsystemsuchashomemonitoring,itwouldbedifficulttomodeltheunexpectedfeatureswithstraightforwardtechniques.

Post-processingandrepresentation:Asaresultofhealthcaresystems,anupcomingapproachcoulduseclassicaldataminingtechniquestogetherwithmethodssuchasnaturallanguagegenerationwhichuncovertrendsinthedatabutalsoexplaintheprocesstobothexpertandnon-expertusers.Workssuchas[29,119]havedemonstratedthepossibleusesofsuchsystemsinbothclinicalandexperimentalcontexts.

Insum,thischapterhasprovidedanoverviewofthecurrenttrendsandchallengesofminingphysiologicalsensordatawithinhealthmonitoringsys-tems.Thechapterinvestigated1)themaindataminingtasksinhealthmoni-toringsystems,2)thedominantdataminingapproachesthathavebeenused,and3)variousdatatypesandtheirproperties.

Inrelationtotherestofthisthesis,thischapterhasshowntheneedofcon-sideringnewdataanalysisapproachestoextractvaluableinformationfromthedatabeyondexpertknowledge.Thisneedofdata-drivenminingofsensordataisthenperformedinchapters7and8,asthefirststepwithintheframeworkofdescribingphysiologicalsensordata.

Chapter 7

Physiological Time Series Data:

Preparation and Processing

“We are drowning in information but starved forknowledge.”

— John Naisbitt (1929–)

This chapter presents the first steps of data analysis to process the raw dataof physiological sensors and exploit a set of meaningful and interesting

information. This processing includes data collection/acquisition and miningin order to extract trends and patterns. This kind of information is then usedfor the tasks of creating semantic representations and linguistic descriptions forphysiological sensor data (chapter 9).

With the increase of wearable sensor technology in both clinical and athome settings, the accumulation of physiological sensor data requires a con-centrated effort on the analysis and modelling of this data [58]. Via sensordata analysis and modelling, it is possible to achieve a deeper understandingof the correlations between long-term measurements of physiological parame-ters and medical conditions. Typically, this process involves diverse data min-ing techniques on sensor data to acquire patient-specific models [28, 213]. Ingeneral, such approaches are either knowledge-driven or data-driven. Using aknowledge-driven approach leads to a supervised model of information extrac-tion, but information is restricted to expert domain knowledge [235]. On theother hand, data-driven methods enable a system to discover hidden and po-tentially useful information through the physiological sensor data and to buildmodels based on the experimental data [28]. In order to leverage from data-driven approaches, a solution whereby hidden patterns can be captured andmade explicit in human consumable terms, i.e., semantics, is beneficial. Suchan approach would not only facilitate automatic monitoring but contribute to

105

Chapter7

PhysiologicalTimeSeriesData:

PreparationandProcessing

“Wearedrowningininformationbutstarvedforknowledge.”

—JohnNaisbitt(1929–)

Thischapterpresentsthefirststepsofdataanalysistoprocesstherawdataofphysiologicalsensorsandexploitasetofmeaningfulandinteresting

information.Thisprocessingincludesdatacollection/acquisitionandmininginordertoextracttrendsandpatterns.Thiskindofinformationisthenusedforthetasksofcreatingsemanticrepresentationsandlinguisticdescriptionsforphysiologicalsensordata(chapter9).

Withtheincreaseofwearablesensortechnologyinbothclinicalandathomesettings,theaccumulationofphysiologicalsensordatarequiresacon-centratedeffortontheanalysisandmodellingofthisdata[58].Viasensordataanalysisandmodelling,itispossibletoachieveadeeperunderstandingofthecorrelationsbetweenlong-termmeasurementsofphysiologicalparame-tersandmedicalconditions.Typically,thisprocessinvolvesdiversedatamin-ingtechniquesonsensordatatoacquirepatient-specificmodels[28,213].Ingeneral,suchapproachesareeitherknowledge-drivenordata-driven.Usingaknowledge-drivenapproachleadstoasupervisedmodelofinformationextrac-tion,butinformationisrestrictedtoexpertdomainknowledge[235].Ontheotherhand,data-drivenmethodsenableasystemtodiscoverhiddenandpo-tentiallyusefulinformationthroughthephysiologicalsensordataandtobuildmodelsbasedontheexperimentaldata[28].Inordertoleveragefromdata-drivenapproaches,asolutionwherebyhiddenpatternscanbecapturedandmadeexplicitinhumanconsumableterms,i.e.,semantics,isbeneficial.Suchanapproachwouldnotonlyfacilitateautomaticmonitoringbutcontributeto

105

Chapter 7

Physiological Time Series Data:

Preparation and Processing

“We are drowning in information but starved forknowledge.”

— John Naisbitt (1929–)

This chapter presents the first steps of data analysis to process the raw dataof physiological sensors and exploit a set of meaningful and interesting

information. This processing includes data collection/acquisition and miningin order to extract trends and patterns. This kind of information is then usedfor the tasks of creating semantic representations and linguistic descriptions forphysiological sensor data (chapter 9).

With the increase of wearable sensor technology in both clinical and athome settings, the accumulation of physiological sensor data requires a con-centrated effort on the analysis and modelling of this data [58]. Via sensordata analysis and modelling, it is possible to achieve a deeper understandingof the correlations between long-term measurements of physiological parame-ters and medical conditions. Typically, this process involves diverse data min-ing techniques on sensor data to acquire patient-specific models [28, 213]. Ingeneral, such approaches are either knowledge-driven or data-driven. Using aknowledge-driven approach leads to a supervised model of information extrac-tion, but information is restricted to expert domain knowledge [235]. On theother hand, data-driven methods enable a system to discover hidden and po-tentially useful information through the physiological sensor data and to buildmodels based on the experimental data [28]. In order to leverage from data-driven approaches, a solution whereby hidden patterns can be captured andmade explicit in human consumable terms, i.e., semantics, is beneficial. Suchan approach would not only facilitate automatic monitoring but contribute to

105

Chapter7

PhysiologicalTimeSeriesData:

PreparationandProcessing

“Wearedrowningininformationbutstarvedforknowledge.”

—JohnNaisbitt(1929–)

Thischapterpresentsthefirststepsofdataanalysistoprocesstherawdataofphysiologicalsensorsandexploitasetofmeaningfulandinteresting

information.Thisprocessingincludesdatacollection/acquisitionandmininginordertoextracttrendsandpatterns.Thiskindofinformationisthenusedforthetasksofcreatingsemanticrepresentationsandlinguisticdescriptionsforphysiologicalsensordata(chapter9).

Withtheincreaseofwearablesensortechnologyinbothclinicalandathomesettings,theaccumulationofphysiologicalsensordatarequiresacon-centratedeffortontheanalysisandmodellingofthisdata[58].Viasensordataanalysisandmodelling,itispossibletoachieveadeeperunderstandingofthecorrelationsbetweenlong-termmeasurementsofphysiologicalparame-tersandmedicalconditions.Typically,thisprocessinvolvesdiversedatamin-ingtechniquesonsensordatatoacquirepatient-specificmodels[28,213].Ingeneral,suchapproachesareeitherknowledge-drivenordata-driven.Usingaknowledge-drivenapproachleadstoasupervisedmodelofinformationextrac-tion,butinformationisrestrictedtoexpertdomainknowledge[235].Ontheotherhand,data-drivenmethodsenableasystemtodiscoverhiddenandpo-tentiallyusefulinformationthroughthephysiologicalsensordataandtobuildmodelsbasedontheexperimentaldata[28].Inordertoleveragefromdata-drivenapproaches,asolutionwherebyhiddenpatternscanbecapturedandmadeexplicitinhumanconsumableterms,i.e.,semantics,isbeneficial.Suchanapproachwouldnotonlyfacilitateautomaticmonitoringbutcontributeto

105

106 CHAPTER 7. PHYSIO. DATA: PREPARATION & PROCESSING

Figure 7.1: The wearable sensor, Bioharness3 [5], worn on the chest is able tolocally store the measured data or wirelessly transmit it via Bluetooth.

a deepening and betterment of our knowledge in understanding the relationshipbetween specific ailments and large-scale physiological time series and continu-ous data. This chapter aims to present some of data mining models to find suchpatterns in physiological sensor measurements.

7.1 Input Time Series Sensor Data: Collection and

Acquisition

This section describes the process of collecting and acquiring sensor data innon-clinical and clinical conditions. In this thesis, the collected non-clinical datais used for the trend detection and trend descriptions. The acquired clinical datatough is used wider for pattern abstraction, and later on in chapters 8 and 9for linguistic description of such patterns.

7.1.1 Wearable Sensors, Non-clinical Data

The proposed approach here is designed to consider several continuous healthparameters which are collected by wearable sensors or clinical records of phys-iological data. In this work, a wearable sensing device called Bioharness3 [5] isused which records various vital signs of the body including heart rate, respira-tion rate, skin temperature, activity, and electrocardiogram (ECG).1 This sensoris worn on the chest and is able to locally store data or wirelessly transmittingit via Bluetooth (Figure 7.1). In this process of data analysis, the input data iscontinuous measurements which include physiological time series signals for aspecific period.

Within the non-clinical condition, focus is primarily given to health param-eters heart rate (HR) and respiration rate (RR), which are common vital signsin the health monitoring domain [28]. In this study, each health parameter is

1 This specific product is now available on the provider’s web page with another commercialname: Zephyr™ Strap [6].

106CHAPTER7.PHYSIO.DATA:PREPARATION&PROCESSING

Figure7.1:Thewearablesensor,Bioharness3[5],wornonthechestisabletolocallystorethemeasureddataorwirelesslytransmititviaBluetooth.

adeepeningandbettermentofourknowledgeinunderstandingtherelationshipbetweenspecificailmentsandlarge-scalephysiologicaltimeseriesandcontinu-ousdata.Thischapteraimstopresentsomeofdataminingmodelstofindsuchpatternsinphysiologicalsensormeasurements.

7.1InputTimeSeriesSensorData:Collectionand

Acquisition

Thissectiondescribestheprocessofcollectingandacquiringsensordatainnon-clinicalandclinicalconditions.Inthisthesis,thecollectednon-clinicaldataisusedforthetrenddetectionandtrenddescriptions.Theacquiredclinicaldatatoughisusedwiderforpatternabstraction,andlateroninchapters8and9forlinguisticdescriptionofsuchpatterns.

7.1.1WearableSensors,Non-clinicalData

Theproposedapproachhereisdesignedtoconsiderseveralcontinuoushealthparameterswhicharecollectedbywearablesensorsorclinicalrecordsofphys-iologicaldata.Inthiswork,awearablesensingdevicecalledBioharness3[5]isusedwhichrecordsvariousvitalsignsofthebodyincludingheartrate,respira-tionrate,skintemperature,activity,andelectrocardiogram(ECG).1ThissensoriswornonthechestandisabletolocallystoredataorwirelesslytransmittingitviaBluetooth(Figure7.1).Inthisprocessofdataanalysis,theinputdataiscontinuousmeasurementswhichincludephysiologicaltimeseriessignalsforaspecificperiod.

Withinthenon-clinicalcondition,focusisprimarilygiventohealthparam-etersheartrate(HR)andrespirationrate(RR),whicharecommonvitalsignsinthehealthmonitoringdomain[28].Inthisstudy,eachhealthparameteris

1Thisspecificproductisnowavailableontheprovider’swebpagewithanothercommercialname:Zephyr™Strap[6].


Figure 7.1: The wearable sensor, Bioharness3 [5], worn on the chest is able tolocally store the measured data or wirelessly transmit it via Bluetooth.

a deepening and betterment of our knowledge in understanding the relationshipbetween specific ailments and large-scale physiological time series and continu-ous data. This chapter aims to present some of data mining models to find suchpatterns in physiological sensor measurements.

7.1 Input Time Series Sensor Data: Collection and

Acquisition

This section describes the process of collecting and acquiring sensor data innon-clinical and clinical conditions. In this thesis, the collected non-clinical datais used for the trend detection and trend descriptions. The acquired clinical datatough is used wider for pattern abstraction, and later on in chapters 8 and 9for linguistic description of such patterns.

7.1.1 Wearable Sensors, Non-clinical Data

The proposed approach here is designed to consider several continuous healthparameters which are collected by wearable sensors or clinical records of phys-iological data. In this work, a wearable sensing device called Bioharness3 [5] isused which records various vital signs of the body including heart rate, respira-tion rate, skin temperature, activity, and electrocardiogram (ECG).1 This sensoris worn on the chest and is able to locally store data or wirelessly transmittingit via Bluetooth (Figure 7.1). In this process of data analysis, the input data iscontinuous measurements which include physiological time series signals for aspecific period.

Within the non-clinical condition, focus is primarily given to health param-eters heart rate (HR) and respiration rate (RR), which are common vital signsin the health monitoring domain [28]. In this study, each health parameter is

1 This specific product is now available on the provider’s web page with another commercialname: Zephyr™ Strap [6].


Figure7.1:Thewearablesensor,Bioharness3[5],wornonthechestisabletolocallystorethemeasureddataorwirelesslytransmititviaBluetooth.

adeepeningandbettermentofourknowledgeinunderstandingtherelationshipbetweenspecificailmentsandlarge-scalephysiologicaltimeseriesandcontinu-ousdata.Thischapteraimstopresentsomeofdataminingmodelstofindsuchpatternsinphysiologicalsensormeasurements.

7.1InputTimeSeriesSensorData:Collectionand

Acquisition

Thissectiondescribestheprocessofcollectingandacquiringsensordatainnon-clinicalandclinicalconditions.Inthisthesis,thecollectednon-clinicaldataisusedforthetrenddetectionandtrenddescriptions.Theacquiredclinicaldatatoughisusedwiderforpatternabstraction,andlateroninchapters8and9forlinguisticdescriptionofsuchpatterns.

7.1.1WearableSensors,Non-clinicalData

Theproposedapproachhereisdesignedtoconsiderseveralcontinuoushealthparameterswhicharecollectedbywearablesensorsorclinicalrecordsofphys-iologicaldata.Inthiswork,awearablesensingdevicecalledBioharness3[5]isusedwhichrecordsvariousvitalsignsofthebodyincludingheartrate,respira-tionrate,skintemperature,activity,andelectrocardiogram(ECG).1ThissensoriswornonthechestandisabletolocallystoredataorwirelesslytransmittingitviaBluetooth(Figure7.1).Inthisprocessofdataanalysis,theinputdataiscontinuousmeasurementswhichincludephysiologicaltimeseriessignalsforaspecificperiod.

Withinthenon-clinicalcondition,focusisprimarilygiventohealthparam-etersheartrate(HR)andrespirationrate(RR),whicharecommonvitalsignsinthehealthmonitoringdomain[28].Inthisstudy,eachhealthparameteris

1Thisspecificproductisnowavailableontheprovider’swebpagewithanothercommercialname:Zephyr™Strap[6].

7.1. INPUT TIME SERIES SENSOR DATA 107

20:30 21:45 23:00 00:15 01:30 02:45 04:00 05:15 06:30 07:45

0

20

40

60

80

100

120

140

160

180

200

220

HH:MM

bp

m

HR

RR

Watching TVExercising Walking

Walking Sleeping

Figure 7.2: An example of non-clinical measurements depicting thirteen hoursof heart rate (HR, top) and respiration rate (RR, down) during sequential ac-tivities. The unit bpm is used for both signals (beats per minute for heart rateand breaths per minute for respiration rate).

considered independently. An example of the input measurements is shown inFigure 7.2. The data in this figure has been recorded for thirteen continuoushours during the sequential activities such as exercising, walking, watching TV,and sleeping.

7.1.2 Clinical Physiological Data

A challenge in the evaluation is to find reliable data sets consisting of long-term measurements of physiological parameters (vital signs) where such sen-sor data is annotated with ground truth information about patients’ condi-tions. Although the proposed approach is applicable to a variety of settings(Intensive care unit or ICU, ambulatory, and at-home monitoring), establishedbenchmarks are more readily available from sensor data sets in a clinicalsetting. Thus, a data set of physiological sensor data from the online Phy-sioNet database is used [105]. In particular, a numeric data set within thisdatabase called MIMIC (multi parameter intelligent monitoring for intensivecare) database [3] in considered. This data set contains periodic numeric mea-surements of physiological variables, such as heart rate, blood pressure, respi-ration rate, and oxygen saturation, obtained from bedside ICU monitors [156].The entire database includes multiple recordings with various lengths of mea-surements (from 1 hour to 77 hours) which are acquired from 90 subjects (pa-

7.1.INPUTTIMESERIESSENSORDATA107

20:3021:4523:0000:1501:3002:4504:0005:1506:3007:45

0

20

40

60

80

100

120

140

160

180

200

220

HH:MM

bp

m

HR

RR

Watching TV ExercisingWalking

WalkingSleeping

Figure7.2:Anexampleofnon-clinicalmeasurementsdepictingthirteenhoursofheartrate(HR,top)andrespirationrate(RR,down)duringsequentialac-tivities.Theunitbpmisusedforbothsignals(beatsperminuteforheartrateandbreathsperminuteforrespirationrate).

consideredindependently.AnexampleoftheinputmeasurementsisshowninFigure7.2.Thedatainthisfigurehasbeenrecordedforthirteencontinuoushoursduringthesequentialactivitiessuchasexercising,walking,watchingTV,andsleeping.

7.1.2ClinicalPhysiologicalData

Achallengeintheevaluationistofindreliabledatasetsconsistingoflong-termmeasurementsofphysiologicalparameters(vitalsigns)wheresuchsen-sordataisannotatedwithgroundtruthinformationaboutpatients’condi-tions.Althoughtheproposedapproachisapplicabletoavarietyofsettings(IntensivecareunitorICU,ambulatory,andat-homemonitoring),establishedbenchmarksaremorereadilyavailablefromsensordatasetsinaclinicalsetting.Thus,adatasetofphysiologicalsensordatafromtheonlinePhy-sioNetdatabaseisused[105].Inparticular,anumericdatasetwithinthisdatabasecalledMIMIC(multiparameterintelligentmonitoringforintensivecare)database[3]inconsidered.Thisdatasetcontainsperiodicnumericmea-surementsofphysiologicalvariables,suchasheartrate,bloodpressure,respi-rationrate,andoxygensaturation,obtainedfrombedsideICUmonitors[156].Theentiredatabaseincludesmultiplerecordingswithvariouslengthsofmea-surements(from1hourto77hours)whichareacquiredfrom90subjects(pa-

7.1. INPUT TIME SERIES SENSOR DATA 107

20:30 21:45 23:00 00:15 01:30 02:45 04:00 05:15 06:30 07:45

0

20

40

60

80

100

120

140

160

180

200

220

HH:MM

bp

m

HR

RR

Watching TVExercising Walking

Walking Sleeping

Figure 7.2: An example of non-clinical measurements depicting thirteen hoursof heart rate (HR, top) and respiration rate (RR, down) during sequential ac-tivities. The unit bpm is used for both signals (beats per minute for heart rateand breaths per minute for respiration rate).

considered independently. An example of the input measurements is shown inFigure 7.2. The data in this figure has been recorded for thirteen continuoushours during the sequential activities such as exercising, walking, watching TV,and sleeping.

7.1.2 Clinical Physiological Data

A challenge in the evaluation is to find reliable data sets consisting of long-term measurements of physiological parameters (vital signs) where such sen-sor data is annotated with ground truth information about patients’ condi-tions. Although the proposed approach is applicable to a variety of settings(Intensive care unit or ICU, ambulatory, and at-home monitoring), establishedbenchmarks are more readily available from sensor data sets in a clinicalsetting. Thus, a data set of physiological sensor data from the online Phy-sioNet database is used [105]. In particular, a numeric data set within thisdatabase called MIMIC (multi parameter intelligent monitoring for intensivecare) database [3] in considered. This data set contains periodic numeric mea-surements of physiological variables, such as heart rate, blood pressure, respi-ration rate, and oxygen saturation, obtained from bedside ICU monitors [156].The entire database includes multiple recordings with various lengths of mea-surements (from 1 hour to 77 hours) which are acquired from 90 subjects (pa-

7.1.INPUTTIMESERIESSENSORDATA107

20:3021:4523:0000:1501:3002:4504:0005:1506:3007:45

0

20

40

60

80

100

120

140

160

180

200

220

HH:MM

bp

m

HR

RR

Watching TV ExercisingWalking

WalkingSleeping

Figure7.2:Anexampleofnon-clinicalmeasurementsdepictingthirteenhoursofheartrate(HR,top)andrespirationrate(RR,down)duringsequentialac-tivities.Theunitbpmisusedforbothsignals(beatsperminuteforheartrateandbreathsperminuteforrespirationrate).

consideredindependently.AnexampleoftheinputmeasurementsisshowninFigure7.2.Thedatainthisfigurehasbeenrecordedforthirteencontinuoushoursduringthesequentialactivitiessuchasexercising,walking,watchingTV,andsleeping.

7.1.2ClinicalPhysiologicalData

Achallengeintheevaluationistofindreliabledatasetsconsistingoflong-termmeasurementsofphysiologicalparameters(vitalsigns)wheresuchsen-sordataisannotatedwithgroundtruthinformationaboutpatients’condi-tions.Althoughtheproposedapproachisapplicabletoavarietyofsettings(IntensivecareunitorICU,ambulatory,andat-homemonitoring),establishedbenchmarksaremorereadilyavailablefromsensordatasetsinaclinicalsetting.Thus,adatasetofphysiologicalsensordatafromtheonlinePhy-sioNetdatabaseisused[105].Inparticular,anumericdatasetwithinthisdatabasecalledMIMIC(multiparameterintelligentmonitoringforintensivecare)database[3]inconsidered.Thisdatasetcontainsperiodicnumericmea-surementsofphysiologicalvariables,suchasheartrate,bloodpressure,respi-rationrate,andoxygensaturation,obtainedfrombedsideICUmonitors[156].Theentiredatabaseincludesmultiplerecordingswithvariouslengthsofmea-surements(from1hourto77hours)whichareacquiredfrom90subjects(pa-


Table 7.1: Clinical conditions and their subjects in mimic database, after re-moving unreliable measurements.

Clinicalconditions

No. ofsubjects(records)

No. ofmale/female

Age:[min,max],

average

Averagelength ofrecords

Resp. failure 10 (17) 7/3 [38,90], 67 32h25m

Bleed 2 (4) 1/1 [45,70], 57 44h45m

CHF 13 (17) 6/7 [54,92], 75 33h15m

Brain injury 2 (3) 1/1 [68,75], 70 21h30m

Sepsis 4 (5) 3/1 [27,88], 64 31h20m

MI 6 (8) 2/4 [63,80], 68 42h35m

Angina 2 (4) 1/1 [67,68], 67 41h10m

Post-op Valve 2 (5) 0/2 [49,67], 58 40h45m

Post-op CABG 3 (3) 1/2 [49,80], 66 40h20m

tients) with different ages and genders. The subjects in this database have beenmanually labelled different clinical categories related to their medical problems.In this work, the numeric records of the subjects from nine major clinical con-ditions have been selected to be analysed and modelled. The considered clinicalconditions include Respiratory failure, Bleed (loss of blood from the circula-tory system), CHF (chronic heart failure), Brain injury, Sepsis, MI (myocardialinfarction, i.e. heart attack), Angina, Post-op Valve (heart valve surgery), andPost-op CABG (coronary artery bypass grafting surgery). General properties ofthe nine clinical conditions and the information about the selected subjects arelisted in Table 7.1.

Here, the subjects with records consisting of at least 12 hours of continuousreadings are considered to facilitate the identification of patterns over longertime horizons. Three physiological variables have been chosen to be processed:heart rate (HR), means of blood pressure (BP), and respiration rate (RR). Eachmeasurement consists of long-term sequential data (i.e. time series) with a reso-lution of 1Hz. As an example, Figure 7.3 shows seven hours of sensor readingsfrom the raw sensor data of a patient suffering from the CHF condition forvariables HR, BP, and RR.

Working with the clinical measurements in the MIMIC database is challeng-ing, due to dealing with incomplete, sparse, non-uniform, and irregular rawdata [154]. Before analysing the records of the subjects, the measurements arecleaned in the following steps:

1. All records that include three mentioned physiological variables arepicked for analysis,


Table7.1:Clinicalconditionsandtheirsubjectsinmimicdatabase,afterre-movingunreliablemeasurements.

Clinicalconditions

No.ofsubjects(records)

No.ofmale/female

Age:[min,max],

average

Averagelengthofrecords

Resp.failure10(17)7/3[38,90],6732h25m

Bleed2(4)1/1[45,70],5744h45m

CHF13(17)6/7[54,92],7533h15m

Braininjury2(3)1/1[68,75],7021h30m

Sepsis4(5)3/1[27,88],6431h20m

MI6(8)2/4[63,80],6842h35m

Angina2(4)1/1[67,68],6741h10m

Post-opValve2(5)0/2[49,67],5840h45m

Post-opCABG3(3)1/2[49,80],6640h20m

tients)withdifferentagesandgenders.Thesubjectsinthisdatabasehavebeenmanuallylabelleddifferentclinicalcategoriesrelatedtotheirmedicalproblems.Inthiswork,thenumericrecordsofthesubjectsfromninemajorclinicalcon-ditionshavebeenselectedtobeanalysedandmodelled.TheconsideredclinicalconditionsincludeRespiratoryfailure,Bleed(lossofbloodfromthecircula-torysystem),CHF(chronicheartfailure),Braininjury,Sepsis,MI(myocardialinfarction,i.e.heartattack),Angina,Post-opValve(heartvalvesurgery),andPost-opCABG(coronaryarterybypassgraftingsurgery).GeneralpropertiesofthenineclinicalconditionsandtheinformationabouttheselectedsubjectsarelistedinTable7.1.

Here,thesubjectswithrecordsconsistingofatleast12hoursofcontinuousreadingsareconsideredtofacilitatetheidentificationofpatternsoverlongertimehorizons.Threephysiologicalvariableshavebeenchosentobeprocessed:heartrate(HR),meansofbloodpressure(BP),andrespirationrate(RR).Eachmeasurementconsistsoflong-termsequentialdata(i.e.timeseries)withareso-lutionof1Hz.Asanexample,Figure7.3showssevenhoursofsensorreadingsfromtherawsensordataofapatientsufferingfromtheCHFconditionforvariablesHR,BP,andRR.

WorkingwiththeclinicalmeasurementsintheMIMICdatabaseischalleng-ing,duetodealingwithincomplete,sparse,non-uniform,andirregularrawdata[154].Beforeanalysingtherecordsofthesubjects,themeasurementsarecleanedinthefollowingsteps:

1.Allrecordsthatincludethreementionedphysiologicalvariablesarepickedforanalysis,


Table 7.1: Clinical conditions and their subjects in mimic database, after re-moving unreliable measurements.

Clinicalconditions

No. ofsubjects(records)

No. ofmale/female

Age:[min,max],

average

Averagelength ofrecords

Resp. failure 10 (17) 7/3 [38,90], 67 32h25m

Bleed 2 (4) 1/1 [45,70], 57 44h45m

CHF 13 (17) 6/7 [54,92], 75 33h15m

Brain injury 2 (3) 1/1 [68,75], 70 21h30m

Sepsis 4 (5) 3/1 [27,88], 64 31h20m

MI 6 (8) 2/4 [63,80], 68 42h35m

Angina 2 (4) 1/1 [67,68], 67 41h10m

Post-op Valve 2 (5) 0/2 [49,67], 58 40h45m

Post-op CABG 3 (3) 1/2 [49,80], 66 40h20m

tients) with different ages and genders. The subjects in this database have beenmanually labelled different clinical categories related to their medical problems.In this work, the numeric records of the subjects from nine major clinical con-ditions have been selected to be analysed and modelled. The considered clinicalconditions include Respiratory failure, Bleed (loss of blood from the circula-tory system), CHF (chronic heart failure), Brain injury, Sepsis, MI (myocardialinfarction, i.e. heart attack), Angina, Post-op Valve (heart valve surgery), andPost-op CABG (coronary artery bypass grafting surgery). General properties ofthe nine clinical conditions and the information about the selected subjects arelisted in Table 7.1.

Here, the subjects with records consisting of at least 12 hours of continuousreadings are considered to facilitate the identification of patterns over longertime horizons. Three physiological variables have been chosen to be processed:heart rate (HR), means of blood pressure (BP), and respiration rate (RR). Eachmeasurement consists of long-term sequential data (i.e. time series) with a reso-lution of 1Hz. As an example, Figure 7.3 shows seven hours of sensor readingsfrom the raw sensor data of a patient suffering from the CHF condition forvariables HR, BP, and RR.

Working with the clinical measurements in the MIMIC database is challeng-ing, due to dealing with incomplete, sparse, non-uniform, and irregular rawdata [154]. Before analysing the records of the subjects, the measurements arecleaned in the following steps:

1. All records that include three mentioned physiological variables arepicked for analysis,


Table7.1:Clinicalconditionsandtheirsubjectsinmimicdatabase,afterre-movingunreliablemeasurements.

Clinicalconditions

No.ofsubjects(records)

No.ofmale/female

Age:[min,max],

average

Averagelengthofrecords

Resp.failure10(17)7/3[38,90],6732h25m

Bleed2(4)1/1[45,70],5744h45m

CHF13(17)6/7[54,92],7533h15m

Braininjury2(3)1/1[68,75],7021h30m

Sepsis4(5)3/1[27,88],6431h20m

MI6(8)2/4[63,80],6842h35m

Angina2(4)1/1[67,68],6741h10m

Post-opValve2(5)0/2[49,67],5840h45m

Post-opCABG3(3)1/2[49,80],6640h20m

tients)withdifferentagesandgenders.Thesubjectsinthisdatabasehavebeenmanuallylabelleddifferentclinicalcategoriesrelatedtotheirmedicalproblems.Inthiswork,thenumericrecordsofthesubjectsfromninemajorclinicalcon-ditionshavebeenselectedtobeanalysedandmodelled.TheconsideredclinicalconditionsincludeRespiratoryfailure,Bleed(lossofbloodfromthecircula-torysystem),CHF(chronicheartfailure),Braininjury,Sepsis,MI(myocardialinfarction,i.e.heartattack),Angina,Post-opValve(heartvalvesurgery),andPost-opCABG(coronaryarterybypassgraftingsurgery).GeneralpropertiesofthenineclinicalconditionsandtheinformationabouttheselectedsubjectsarelistedinTable7.1.

Here,thesubjectswithrecordsconsistingofatleast12hoursofcontinuousreadingsareconsideredtofacilitatetheidentificationofpatternsoverlongertimehorizons.Threephysiologicalvariableshavebeenchosentobeprocessed:heartrate(HR),meansofbloodpressure(BP),andrespirationrate(RR).Eachmeasurementconsistsoflong-termsequentialdata(i.e.timeseries)withareso-lutionof1Hz.Asanexample,Figure7.3showssevenhoursofsensorreadingsfromtherawsensordataofapatientsufferingfromtheCHFconditionforvariablesHR,BP,andRR.

WorkingwiththeclinicalmeasurementsintheMIMICdatabaseischalleng-ing,duetodealingwithincomplete,sparse,non-uniform,andirregularrawdata[154].Beforeanalysingtherecordsofthesubjects,themeasurementsarecleanedinthefollowingsteps:

1.Allrecordsthatincludethreementionedphysiologicalvariablesarepickedforanalysis,

7.2. TREND DETECTION IN PHYSIOLOGICAL TIME SERIES 109

1 2 3 4 5 6

0

20

40

60

80

100

120

hour

RR (breaths/min) BP (mmHg) HR (beats/min)

Figure 7.3: An example of clinical sensor data from MIMIC data set with vari-ables heart rate (HR), blood pressure (BP), and respiration rate (RR).

2. Measurements with short episodes (less than 12 hours) were discarded,because finding meaningful patterns in a short period of data is not rea-sonable,

3. Since the data is collected in a clinical environment with wearable sen-sors, the signal readings involve plenty of artefacts and noise. To avoidprocessing incorrect information

(a) Sensor readings with unreliable values (e.g. zero value for heart rate)are discarded2,

(b) A local regression method (LOESS) as a smoothing function [99] isapplied on readings to reduce the amount of noisy data.

After cleaning the data set, 45 subjects in nine clinical conditions are chosento be analysed, which include reliable measurements with all three variables.

7.2 Partial Trend Detection in Physiological Time

Series Data

Mining of physiological time series data is significant not only to model but alsoto detect specific health-related vital signs. One of the main challenges in thehealthcare area is how to analyse physiological data such that valuable infor-mation can help the end user (physician or layman). The data analysis moduleaims to detect and represent the principal events and significant trends whichare relevant for the end user. The proposed data processing method is unsu-pervised i.e., without expert knowledge or pre-defined rules, and can discoverinformation which is not necessarily recognisable by an expert at first glance.

2The ranges of accepted values for three health parameters are: 20 < HR < 200, 30 < BP <220, and 5 < RR < 100.

7.2.TRENDDETECTIONINPHYSIOLOGICALTIMESERIES109

123456

0

20

40

60

80

100

120

hour

RR (breaths/min)BP (mmHg)HR (beats/min)

Figure7.3:AnexampleofclinicalsensordatafromMIMICdatasetwithvari-ablesheartrate(HR),bloodpressure(BP),andrespirationrate(RR).

2.Measurementswithshortepisodes(lessthan12hours)werediscarded,becausefindingmeaningfulpatternsinashortperiodofdataisnotrea-sonable,

3.Sincethedataiscollectedinaclinicalenvironmentwithwearablesen-sors,thesignalreadingsinvolveplentyofartefactsandnoise.Toavoidprocessingincorrectinformation

(a)Sensorreadingswithunreliablevalues(e.g.zerovalueforheartrate)arediscarded2,

(b)Alocalregressionmethod(LOESS)asasmoothingfunction[99]isappliedonreadingstoreducetheamountofnoisydata.

Aftercleaningthedataset,45subjectsinnineclinicalconditionsarechosentobeanalysed,whichincludereliablemeasurementswithallthreevariables.

7.2PartialTrendDetectioninPhysiologicalTime

SeriesData

Miningofphysiologicaltimeseriesdataissignificantnotonlytomodelbutalsotodetectspecifichealth-relatedvitalsigns.Oneofthemainchallengesinthehealthcareareaishowtoanalysephysiologicaldatasuchthatvaluableinfor-mationcanhelptheenduser(physicianorlayman).Thedataanalysismoduleaimstodetectandrepresenttheprincipaleventsandsignificanttrendswhicharerelevantfortheenduser.Theproposeddataprocessingmethodisunsu-pervisedi.e.,withoutexpertknowledgeorpre-definedrules,andcandiscoverinformationwhichisnotnecessarilyrecognisablebyanexpertatfirstglance.

2Therangesofacceptedvaluesforthreehealthparametersare:20<HR<200,30<BP<220,and5<RR<100.


1 2 3 4 5 6

0

20

40

60

80

100

120

hour

RR (breaths/min) BP (mmHg) HR (beats/min)

Figure 7.3: An example of clinical sensor data from MIMIC data set with vari-ables heart rate (HR), blood pressure (BP), and respiration rate (RR).

2. Measurements with short episodes (less than 12 hours) were discarded,because finding meaningful patterns in a short period of data is not rea-sonable,

3. Since the data is collected in a clinical environment with wearable sen-sors, the signal readings involve plenty of artefacts and noise. To avoidprocessing incorrect information

(a) Sensor readings with unreliable values (e.g. zero value for heart rate)are discarded2,

(b) A local regression method (LOESS) as a smoothing function [99] isapplied on readings to reduce the amount of noisy data.

After cleaning the data set, 45 subjects in nine clinical conditions are chosento be analysed, which include reliable measurements with all three variables.

7.2 Partial Trend Detection in Physiological Time

Series Data

Mining of physiological time series data is significant not only to model but alsoto detect specific health-related vital signs. One of the main challenges in thehealthcare area is how to analyse physiological data such that valuable infor-mation can help the end user (physician or layman). The data analysis moduleaims to detect and represent the principal events and significant trends whichare relevant for the end user. The proposed data processing method is unsu-pervised i.e., without expert knowledge or pre-defined rules, and can discoverinformation which is not necessarily recognisable by an expert at first glance.

2The ranges of accepted values for three health parameters are: 20 < HR < 200, 30 < BP <220, and 5 < RR < 100.


123456

0

20

40

60

80

100

120

hour

RR (breaths/min)BP (mmHg)HR (beats/min)

Figure7.3:AnexampleofclinicalsensordatafromMIMICdatasetwithvari-ablesheartrate(HR),bloodpressure(BP),andrespirationrate(RR).

2.Measurementswithshortepisodes(lessthan12hours)werediscarded,becausefindingmeaningfulpatternsinashortperiodofdataisnotrea-sonable,

3.Sincethedataiscollectedinaclinicalenvironmentwithwearablesen-sors,thesignalreadingsinvolveplentyofartefactsandnoise.Toavoidprocessingincorrectinformation

(a)Sensorreadingswithunreliablevalues(e.g.zerovalueforheartrate)arediscarded2,

(b)Alocalregressionmethod(LOESS)asasmoothingfunction[99]isappliedonreadingstoreducetheamountofnoisydata.

Aftercleaningthedataset,45subjectsinnineclinicalconditionsarechosentobeanalysed,whichincludereliablemeasurementswithallthreevariables.

7.2PartialTrendDetectioninPhysiologicalTime

SeriesData

Miningofphysiologicaltimeseriesdataissignificantnotonlytomodelbutalsotodetectspecifichealth-relatedvitalsigns.Oneofthemainchallengesinthehealthcareareaishowtoanalysephysiologicaldatasuchthatvaluableinfor-mationcanhelptheenduser(physicianorlayman).Thedataanalysismoduleaimstodetectandrepresenttheprincipaleventsandsignificanttrendswhicharerelevantfortheenduser.Theproposeddataprocessingmethodisunsu-pervisedi.e.,withoutexpertknowledgeorpre-definedrules,andcandiscoverinformationwhichisnotnecessarilyrecognisablebyanexpertatfirstglance.

2Therangesofacceptedvaluesforthreehealthparametersare:20<HR<200,30<BP<220,and5<RR<100.


The data analysis module includes preprocessing and segmentation steps whichhelp the system to perform statistical information and trend detection compo-nents.

Preprocessing

In comparison with clinical data, noise and artefacts (i.e., disturbances or ab-normalities within the signals) are more predominant in the signal from wear-able sensors. Thus, the preprocessing of signals is a necessary task. In this work,artefacts are eliminated heuristically by setting some thresholds for each healthparameter. Then, the local regression method (LOESS) has been applied to re-duce the noise in signals. This method is a non-parametric regression methodwhich is commonly used as smoothing function [149]. Figure 7.4b shows anexample of the LOESS smoothing model for heart rate and respiration ratefor the raw data presented in Figure 7.4a. The bandwidth parameter in thismethod is adapted depending on the requirements of the output for resolutionof information. The output of this step is a prepared time series data for furtheranalysis.

Signal Segmentation

After the smoothed signals are generated, a representation for each time seriesthat captures temporal changes in the data is generated. Several methods havebeen introduced such as Fourier and Wavelet transforms, Symbolic Mappings,and Piecewise Linear models etc. [148] to represent the primary apparent at-tributes of the signals. Here, piecewise linear approximation (PLA) [130] as asegmentation method is selected for this system which can make a significantrepresentation of the time series simply and efficiently. The output of the PLAmethod on a time series with length n is a set of linear segments with sizem (m � n). The most popular approach to calculate the PLA is Bottom-upmethod. This approach starts with n/2 segments and merges the two next seg-ments which have minimum distance error after merging. This process repeatstill some stopping criteria are satisfied. The criteria could be setting a thresholdon the maximum distance error and on the number of segments.

There are several methods to find the optimal number of segments [82]. Inthis work, the threshold parameter is heuristically tuned based on the requestedresolution of the output trends.

Figure 7.4c and Figure 7.4d show the examples of output of PLA method forthe smoothed heart rate and respiration rate time series presented in Figure 7.4bwith m=10 and m=25, respectively. Based on the request from the end user, itis possible to provide all the details of events during the measurements or justreporting the major trends and changes with tuning the number of segments.


Thedataanalysismoduleincludespreprocessingandsegmentationstepswhichhelpthesystemtoperformstatisticalinformationandtrenddetectioncompo-nents.

Preprocessing

Incomparisonwithclinicaldata,noiseandartefacts(i.e.,disturbancesorab-normalitieswithinthesignals)aremorepredominantinthesignalfromwear-ablesensors.Thus,thepreprocessingofsignalsisanecessarytask.Inthiswork,artefactsareeliminatedheuristicallybysettingsomethresholdsforeachhealthparameter.Then,thelocalregressionmethod(LOESS)hasbeenappliedtore-ducethenoiseinsignals.Thismethodisanon-parametricregressionmethodwhichiscommonlyusedassmoothingfunction[149].Figure7.4bshowsanexampleoftheLOESSsmoothingmodelforheartrateandrespirationratefortherawdatapresentedinFigure7.4a.Thebandwidthparameterinthismethodisadapteddependingontherequirementsoftheoutputforresolutionofinformation.Theoutputofthisstepisapreparedtimeseriesdataforfurtheranalysis.

SignalSegmentation

Afterthesmoothedsignalsaregenerated,arepresentationforeachtimeseriesthatcapturestemporalchangesinthedataisgenerated.SeveralmethodshavebeenintroducedsuchasFourierandWavelettransforms,SymbolicMappings,andPiecewiseLinearmodelsetc.[148]torepresenttheprimaryapparentat-tributesofthesignals.Here,piecewiselinearapproximation(PLA)[130]asasegmentationmethodisselectedforthissystemwhichcanmakeasignificantrepresentationofthetimeseriessimplyandefficiently.TheoutputofthePLAmethodonatimeserieswithlengthnisasetoflinearsegmentswithsizem(m�n).ThemostpopularapproachtocalculatethePLAisBottom-upmethod.Thisapproachstartswithn/2segmentsandmergesthetwonextseg-mentswhichhaveminimumdistanceerroraftermerging.Thisprocessrepeatstillsomestoppingcriteriaaresatisfied.Thecriteriacouldbesettingathresholdonthemaximumdistanceerrorandonthenumberofsegments.

Thereareseveralmethodstofindtheoptimalnumberofsegments[82].Inthiswork,thethresholdparameterisheuristicallytunedbasedontherequestedresolutionoftheoutputtrends.

Figure7.4candFigure7.4dshowtheexamplesofoutputofPLAmethodforthesmoothedheartrateandrespirationratetimeseriespresentedinFigure7.4bwithm=10andm=25,respectively.Basedontherequestfromtheenduser,itispossibletoprovideallthedetailsofeventsduringthemeasurementsorjustreportingthemajortrendsandchangeswithtuningthenumberofsegments.


The data analysis module includes preprocessing and segmentation steps whichhelp the system to perform statistical information and trend detection compo-nents.

Preprocessing

In comparison with clinical data, noise and artefacts (i.e., disturbances or ab-normalities within the signals) are more predominant in the signal from wear-able sensors. Thus, the preprocessing of signals is a necessary task. In this work,artefacts are eliminated heuristically by setting some thresholds for each healthparameter. Then, the local regression method (LOESS) has been applied to re-duce the noise in signals. This method is a non-parametric regression methodwhich is commonly used as smoothing function [149]. Figure 7.4b shows anexample of the LOESS smoothing model for heart rate and respiration ratefor the raw data presented in Figure 7.4a. The bandwidth parameter in thismethod is adapted depending on the requirements of the output for resolutionof information. The output of this step is a prepared time series data for furtheranalysis.

Signal Segmentation

After the smoothed signals are generated, a representation for each time seriesthat captures temporal changes in the data is generated. Several methods havebeen introduced such as Fourier and Wavelet transforms, Symbolic Mappings,and Piecewise Linear models etc. [148] to represent the primary apparent at-tributes of the signals. Here, piecewise linear approximation (PLA) [130] as asegmentation method is selected for this system which can make a significantrepresentation of the time series simply and efficiently. The output of the PLAmethod on a time series with length n is a set of linear segments with sizem (m � n). The most popular approach to calculate the PLA is Bottom-upmethod. This approach starts with n/2 segments and merges the two next seg-ments which have minimum distance error after merging. This process repeatstill some stopping criteria are satisfied. The criteria could be setting a thresholdon the maximum distance error and on the number of segments.

There are several methods to find the optimal number of segments [82]. Inthis work, the threshold parameter is heuristically tuned based on the requestedresolution of the output trends.

Figure 7.4c and Figure 7.4d show the examples of output of PLA method forthe smoothed heart rate and respiration rate time series presented in Figure 7.4bwith m=10 and m=25, respectively. Based on the request from the end user, itis possible to provide all the details of events during the measurements or justreporting the major trends and changes with tuning the number of segments.


Thedataanalysismoduleincludespreprocessingandsegmentationstepswhichhelpthesystemtoperformstatisticalinformationandtrenddetectioncompo-nents.

Preprocessing

Incomparisonwithclinicaldata,noiseandartefacts(i.e.,disturbancesorab-normalitieswithinthesignals)aremorepredominantinthesignalfromwear-ablesensors.Thus,thepreprocessingofsignalsisanecessarytask.Inthiswork,artefactsareeliminatedheuristicallybysettingsomethresholdsforeachhealthparameter.Then,thelocalregressionmethod(LOESS)hasbeenappliedtore-ducethenoiseinsignals.Thismethodisanon-parametricregressionmethodwhichiscommonlyusedassmoothingfunction[149].Figure7.4bshowsanexampleoftheLOESSsmoothingmodelforheartrateandrespirationratefortherawdatapresentedinFigure7.4a.Thebandwidthparameterinthismethodisadapteddependingontherequirementsoftheoutputforresolutionofinformation.Theoutputofthisstepisapreparedtimeseriesdataforfurtheranalysis.

SignalSegmentation

Afterthesmoothedsignalsaregenerated,arepresentationforeachtimeseriesthatcapturestemporalchangesinthedataisgenerated.SeveralmethodshavebeenintroducedsuchasFourierandWavelettransforms,SymbolicMappings,andPiecewiseLinearmodelsetc.[148]torepresenttheprimaryapparentat-tributesofthesignals.Here,piecewiselinearapproximation(PLA)[130]asasegmentationmethodisselectedforthissystemwhichcanmakeasignificantrepresentationofthetimeseriessimplyandefficiently.TheoutputofthePLAmethodonatimeserieswithlengthnisasetoflinearsegmentswithsizem(m�n).ThemostpopularapproachtocalculatethePLAisBottom-upmethod.Thisapproachstartswithn/2segmentsandmergesthetwonextseg-mentswhichhaveminimumdistanceerroraftermerging.Thisprocessrepeatstillsomestoppingcriteriaaresatisfied.Thecriteriacouldbesettingathresholdonthemaximumdistanceerrorandonthenumberofsegments.

Thereareseveralmethodstofindtheoptimalnumberofsegments[82].Inthiswork,thethresholdparameterisheuristicallytunedbasedontherequestedresolutionoftheoutputtrends.

Figure7.4candFigure7.4dshowtheexamplesofoutputofPLAmethodforthesmoothedheartrateandrespirationratetimeseriespresentedinFigure7.4bwithm=10andm=25,respectively.Basedontherequestfromtheenduser,itispossibletoprovideallthedetailsofeventsduringthemeasurementsorjustreportingthemajortrendsandchangeswithtuningthenumberofsegments.


0 1 2 3 4 5 6 7

x 104

0

20

40

60

80

100

120

140

160

time

bp

m(a) Raw data

0 1 2 3 4 5 6 7

x 104

0

20

40

60

80

100

120

140

160

time

bp

m

(b) Smoothness

0 1 2 3 4 5 6 7

x 104

0

20

40

60

80

100

120

140

time (sec)

bp

m

(d) Segmentation, m=10

0 1 2 3 4 5 6 7

x 104

0

20

40

60

80

100

120

140

time (sec)

bp

m

(c) Segmentation, m=25

HR (beat per minute) RR (breath per minute)

Figure 7.4: An example of physiological time series data preprocessing andsegmentation: (a) the raw data of heart rate (HR, top) and respiration rate(RR, down) for 22 hours, (b) an instance of smoothing raw signals usingLOESS method, (c) and (d) examples of the segmentation method (PLA) ofthe smoothed time series in Figure b with m=25 and m=10, respectively.

Partial Trends

Trends are the essential features to detect in physiological time series data asit can provide indications at an early stage of potential health issues and fa-cilitate prevention. The acquired segments from the PLA approach are used todetect trends by applying a trend detection algorithm. The aim of this methodis finding specific trends such as dropping, rising, or unstable changes in themeasurements. Several studies have previously examined partial trends on timeseries segmentation [45, 124]. However, the challenge here is to determinewhich number of segments corresponds to significant events in the data. Inother words, if the number of generated segments is high and each correspondsto a human understandable event, then the approach will produce too many


01234567

x 104

0

20

40

60

80

100

120

140

160

time

bp

m

(a) Raw data

01234567

x 104

0

20

40

60

80

100

120

140

160

time

bp

m

(b) Smoothness

01234567

x 104

0

20

40

60

80

100

120

140

time (sec)

bp

m


01234567

x 104

0

20

40

60

80

100

120

140

time (sec)

bp

m


HR (beat per minute)RR (breath per minute)

Figure7.4:Anexampleofphysiologicaltimeseriesdatapreprocessingandsegmentation:(a)therawdataofheartrate(HR,top)andrespirationrate(RR,down)for22hours,(b)aninstanceofsmoothingrawsignalsusingLOESSmethod,(c)and(d)examplesofthesegmentationmethod(PLA)ofthesmoothedtimeseriesinFigurebwithm=25andm=10,respectively.

PartialTrends

Trendsaretheessentialfeaturestodetectinphysiologicaltimeseriesdataasitcanprovideindicationsatanearlystageofpotentialhealthissuesandfa-cilitateprevention.TheacquiredsegmentsfromthePLAapproachareusedtodetecttrendsbyapplyingatrenddetectionalgorithm.Theaimofthismethodisfindingspecifictrendssuchasdropping,rising,orunstablechangesinthemeasurements.Severalstudieshavepreviouslyexaminedpartialtrendsontimeseriessegmentation[45,124].However,thechallengehereistodeterminewhichnumberofsegmentscorrespondstosignificanteventsinthedata.Inotherwords,ifthenumberofgeneratedsegmentsishighandeachcorrespondstoahumanunderstandableevent,thentheapproachwillproducetoomany


0 1 2 3 4 5 6 7

x 104

0

20

40

60

80

100

120

140

160

time

bp

m

(a) Raw data

0 1 2 3 4 5 6 7

x 104

0

20

40

60

80

100

120

140

160

time

bp

m

(b) Smoothness

0 1 2 3 4 5 6 7

x 104

0

20

40

60

80

100

120

140

time (sec)

bp

m


0 1 2 3 4 5 6 7

x 104

0

20

40

60

80

100

120

140

time (sec)

bp

m


HR (beat per minute) RR (breath per minute)

Figure 7.4: An example of physiological time series data preprocessing andsegmentation: (a) the raw data of heart rate (HR, top) and respiration rate(RR, down) for 22 hours, (b) an instance of smoothing raw signals usingLOESS method, (c) and (d) examples of the segmentation method (PLA) ofthe smoothed time series in Figure b with m=25 and m=10, respectively.

Partial Trends

Trends are the essential features to detect in physiological time series data asit can provide indications at an early stage of potential health issues and fa-cilitate prevention. The acquired segments from the PLA approach are used todetect trends by applying a trend detection algorithm. The aim of this methodis finding specific trends such as dropping, rising, or unstable changes in themeasurements. Several studies have previously examined partial trends on timeseries segmentation [45, 124]. However, the challenge here is to determinewhich number of segments corresponds to significant events in the data. Inother words, if the number of generated segments is high and each correspondsto a human understandable event, then the approach will produce too many


01234567

x 104

0

20

40

60

80

100

120

140

160

time

bp

m

(a) Raw data

01234567

x 104

0

20

40

60

80

100

120

140

160

time

bp

m

(b) Smoothness

01234567

x 104

0

20

40

60

80

100

120

140

time (sec)

bp

m


01234567

x 104

0

20

40

60

80

100

120

140

time (sec)

bp

m


HR (beat per minute)RR (breath per minute)

Figure7.4:Anexampleofphysiologicaltimeseriesdatapreprocessingandsegmentation:(a)therawdataofheartrate(HR,top)andrespirationrate(RR,down)for22hours,(b)aninstanceofsmoothingrawsignalsusingLOESSmethod,(c)and(d)examplesofthesegmentationmethod(PLA)ofthesmoothedtimeseriesinFigurebwithm=25andm=10,respectively.

PartialTrends

Trendsaretheessentialfeaturestodetectinphysiologicaltimeseriesdataasitcanprovideindicationsatanearlystageofpotentialhealthissuesandfa-cilitateprevention.TheacquiredsegmentsfromthePLAapproachareusedtodetecttrendsbyapplyingatrenddetectionalgorithm.Theaimofthismethodisfindingspecifictrendssuchasdropping,rising,orunstablechangesinthemeasurements.Severalstudieshavepreviouslyexaminedpartialtrendsontimeseriessegmentation[45,124].However,thechallengehereistodeterminewhichnumberofsegmentscorrespondstosignificanteventsinthedata.Inotherwords,ifthenumberofgeneratedsegmentsishighandeachcorrespondstoahumanunderstandableevent,thentheapproachwillproducetoomany


events and overburden the user. In contrast, if the number of segments is low,then valuable information about meaningful events could be lost.

To find a proper solution to represent the trends, a partial trend is consid-ered to be a subset of segments, which has a similar tendency as the segmentsand relates to the orientation of data. Each time series is therefore representedas a collection of partial trends. The preliminary step of the algorithm is tonormalise both axes of data by scaling them between 0 and 1. With this nor-malisation, the features of detected trends will be independent of the durationand range of the data (either long-term or short-term data).

The set of segments obtained from the previous step is considered as the in-put of the proposed trend detection method. After normalising, this methodcharacterises the main attributes of the segments, which are longitude andgradient. For each segment si, the length of segment (lensi

) and its gra-dient (gradsi

), the trend detection algorithm starts with a set of segments,S = {s1 . . . sm}. Based on the defined parameters, the algorithm decides forthe segment si: keep it and concatenate it with the current trend, keep it asthe first segment of a new trend, or ignore it. The following function has beendefined to make a balance between the features of si:

f(gradsi, lensi

) = (α− gradsi) − 1/(λ− lensi

)× k (7.1)

where the α and λ are the heuristic thresholds for gradient and length, respec-tively and k is a coefficient to adjust the dependency of features. In this function,if f(gradsi

, lensi) is more than zero then si is kept, otherwise it will be elim-

inated (except some conditions related to length of si and the gradients of itsadjacent). Algorithm 7.1 illustrates the trend detection method with showingin which cases the algorithm makes a new trend or merges the segments in thecurrent trend. Figure 7.5 presents an output of trend detection algorithm for thesegmented heart rate and respiration signals. The annotations on the detectedtrends will be described in Section 8.2.

Depending on the end user requirements, the proposed system supportsmulti-resolution processing of the input signal and is able to summarise bothlong and short-term measurements. Figure 7.6 shows the output of the trenddetection algorithm for two different resolutions of one measurement on heartrate. The first one is long-term data in 22 hours (Figure 7.6, top) and the secondone is short-term data in 4.5 hours (Figure 7.6, bottom). The algorithm has de-tected several partial trends in the second diagram within 4.5 hours. However,the same portion of data has been identified as a single partial trend within 20hours of time series data in the first diagram.


eventsandoverburdentheuser.Incontrast,ifthenumberofsegmentsislow,thenvaluableinformationaboutmeaningfuleventscouldbelost.

Tofindapropersolutiontorepresentthetrends,apartialtrendisconsid-eredtobeasubsetofsegments,whichhasasimilartendencyasthesegmentsandrelatestotheorientationofdata.Eachtimeseriesisthereforerepresentedasacollectionofpartialtrends.Thepreliminarystepofthealgorithmistonormalisebothaxesofdatabyscalingthembetween0and1.Withthisnor-malisation,thefeaturesofdetectedtrendswillbeindependentofthedurationandrangeofthedata(eitherlong-termorshort-termdata).

Thesetofsegmentsobtainedfromthepreviousstepisconsideredasthein-putoftheproposedtrenddetectionmethod.Afternormalising,thismethodcharacterisesthemainattributesofthesegments,whicharelongitudeandgradient.Foreachsegmentsi,thelengthofsegment(lensi)anditsgra-dient(gradsi),thetrenddetectionalgorithmstartswithasetofsegments,S={s1...sm}.Basedonthedefinedparameters,thealgorithmdecidesforthesegmentsi:keepitandconcatenateitwiththecurrenttrend,keepitasthefirstsegmentofanewtrend,orignoreit.Thefollowingfunctionhasbeendefinedtomakeabalancebetweenthefeaturesofsi:

f(gradsi,lensi)=(α−gradsi)−1/(λ−lensi)×k(7.1)

wheretheαandλaretheheuristicthresholdsforgradientandlength,respec-tivelyandkisacoefficienttoadjustthedependencyoffeatures.Inthisfunction,iff(gradsi,lensi)ismorethanzerothensiiskept,otherwiseitwillbeelim-inated(exceptsomeconditionsrelatedtolengthofsiandthegradientsofitsadjacent).Algorithm7.1illustratesthetrenddetectionmethodwithshowinginwhichcasesthealgorithmmakesanewtrendormergesthesegmentsinthecurrenttrend.Figure7.5presentsanoutputoftrenddetectionalgorithmforthesegmentedheartrateandrespirationsignals.TheannotationsonthedetectedtrendswillbedescribedinSection8.2.

Dependingontheenduserrequirements,theproposedsystemsupportsmulti-resolutionprocessingoftheinputsignalandisabletosummarisebothlongandshort-termmeasurements.Figure7.6showstheoutputofthetrenddetectionalgorithmfortwodifferentresolutionsofonemeasurementonheartrate.Thefirstoneislong-termdatain22hours(Figure7.6,top)andthesecondoneisshort-termdatain4.5hours(Figure7.6,bottom).Thealgorithmhasde-tectedseveralpartialtrendsintheseconddiagramwithin4.5hours.However,thesameportionofdatahasbeenidentifiedasasinglepartialtrendwithin20hoursoftimeseriesdatainthefirstdiagram.


events and overburden the user. In contrast, if the number of segments is low,then valuable information about meaningful events could be lost.

To find a proper solution to represent the trends, a partial trend is consid-ered to be a subset of segments, which has a similar tendency as the segmentsand relates to the orientation of data. Each time series is therefore representedas a collection of partial trends. The preliminary step of the algorithm is tonormalise both axes of data by scaling them between 0 and 1. With this nor-malisation, the features of detected trends will be independent of the durationand range of the data (either long-term or short-term data).

The set of segments obtained from the previous step is considered as the in-put of the proposed trend detection method. After normalising, this methodcharacterises the main attributes of the segments, which are longitude andgradient. For each segment si, the length of segment (lensi

) and its gra-dient (gradsi

), the trend detection algorithm starts with a set of segments,S = {s1 . . . sm}. Based on the defined parameters, the algorithm decides forthe segment si: keep it and concatenate it with the current trend, keep it asthe first segment of a new trend, or ignore it. The following function has beendefined to make a balance between the features of si:

f(gradsi, lensi

) = (α− gradsi) − 1/(λ− lensi

)× k (7.1)

where the α and λ are the heuristic thresholds for gradient and length, respec-tively and k is a coefficient to adjust the dependency of features. In this function,if f(gradsi

, lensi) is more than zero then si is kept, otherwise it will be elim-

inated (except some conditions related to length of si and the gradients of itsadjacent). Algorithm 7.1 illustrates the trend detection method with showingin which cases the algorithm makes a new trend or merges the segments in thecurrent trend. Figure 7.5 presents an output of trend detection algorithm for thesegmented heart rate and respiration signals. The annotations on the detectedtrends will be described in Section 8.2.

Depending on the end user requirements, the proposed system supportsmulti-resolution processing of the input signal and is able to summarise bothlong and short-term measurements. Figure 7.6 shows the output of the trenddetection algorithm for two different resolutions of one measurement on heartrate. The first one is long-term data in 22 hours (Figure 7.6, top) and the secondone is short-term data in 4.5 hours (Figure 7.6, bottom). The algorithm has de-tected several partial trends in the second diagram within 4.5 hours. However,the same portion of data has been identified as a single partial trend within 20hours of time series data in the first diagram.


eventsandoverburdentheuser.Incontrast,ifthenumberofsegmentsislow,thenvaluableinformationaboutmeaningfuleventscouldbelost.

Tofindapropersolutiontorepresentthetrends,apartialtrendisconsid-eredtobeasubsetofsegments,whichhasasimilartendencyasthesegmentsandrelatestotheorientationofdata.Eachtimeseriesisthereforerepresentedasacollectionofpartialtrends.Thepreliminarystepofthealgorithmistonormalisebothaxesofdatabyscalingthembetween0and1.Withthisnor-malisation,thefeaturesofdetectedtrendswillbeindependentofthedurationandrangeofthedata(eitherlong-termorshort-termdata).

Thesetofsegmentsobtainedfromthepreviousstepisconsideredasthein-putoftheproposedtrenddetectionmethod.Afternormalising,thismethodcharacterisesthemainattributesofthesegments,whicharelongitudeandgradient.Foreachsegmentsi,thelengthofsegment(lensi)anditsgra-dient(gradsi),thetrenddetectionalgorithmstartswithasetofsegments,S={s1...sm}.Basedonthedefinedparameters,thealgorithmdecidesforthesegmentsi:keepitandconcatenateitwiththecurrenttrend,keepitasthefirstsegmentofanewtrend,orignoreit.Thefollowingfunctionhasbeendefinedtomakeabalancebetweenthefeaturesofsi:

f(gradsi,lensi)=(α−gradsi)−1/(λ−lensi)×k(7.1)

wheretheαandλaretheheuristicthresholdsforgradientandlength,respec-tivelyandkisacoefficienttoadjustthedependencyoffeatures.Inthisfunction,iff(gradsi,lensi)ismorethanzerothensiiskept,otherwiseitwillbeelim-inated(exceptsomeconditionsrelatedtolengthofsiandthegradientsofitsadjacent).Algorithm7.1illustratesthetrenddetectionmethodwithshowinginwhichcasesthealgorithmmakesanewtrendormergesthesegmentsinthecurrenttrend.Figure7.5presentsanoutputoftrenddetectionalgorithmforthesegmentedheartrateandrespirationsignals.TheannotationsonthedetectedtrendswillbedescribedinSection8.2.

Dependingontheenduserrequirements,theproposedsystemsupportsmulti-resolutionprocessingoftheinputsignalandisabletosummarisebothlongandshort-termmeasurements.Figure7.6showstheoutputofthetrenddetectionalgorithmfortwodifferentresolutionsofonemeasurementonheartrate.Thefirstoneislong-termdatain22hours(Figure7.6,top)andthesecondoneisshort-termdatain4.5hours(Figure7.6,bottom).Thealgorithmhasde-tectedseveralpartialtrendsintheseconddiagramwithin4.5hours.However,thesameportionofdatahasbeenidentifiedasasinglepartialtrendwithin20hoursoftimeseriesdatainthefirstdiagram.

7.3. PATTERN ABSTRACTION IN PHYSIOLOGICAL DATA 113

Algorithm 7.1: Partial Trend Detection

Input: Set of segments, S = {s1 . . . sm}

Output: Set of trends, A = {a1 . . .al}, l � m

A ← ∅new trend a

foreach si ∈ S doif f(gradsi

, lensi) > 0 then

if si and si−1 are in different gradient thenadd a to A

new trend a

add si to a

elseif lensi

< λ thenif si and si−1 are in different gradient then

if si and si+1 are in same gradient thenadd a to A

new trend a

add si to a

elseadd si to a

else if gradsi< α then

add a to A

new trend a

7.3 Prototypical Pattern Abstraction in Physiological

Time Series Data

7.3.1 Background on Pattern Abstraction

Dealing with large time series with high granularities is typically a chal-lenge [134]. One of the objectives of this section is to find prototypical pat-terns in sequential data. This is mainly related to the general task of patternabstraction [82]. The main goal of prototypical pattern abstraction is to pro-vide a set of representative patterns from raw time series data, which includestwo phases: 1) discretisation and 2) clustering.

Discretisation or segmentation is a solution to transform a time seriest = (t1,· · · , tn) with n time points into a discrete sequence of segmentsS(t) : s1s2· · · sm, where generally m � n. Within different approaches fortime series discretisation [82], a sliding window method is the most commonlyused algorithm. In a sliding window approach, a time series t is discretised to aset of segments S(t) by sliding a window of size w with a given overlap on two

7.3.PATTERNABSTRACTIONINPHYSIOLOGICALDATA113

Algorithm7.1:PartialTrendDetection

Input:Setofsegments,S={s1...sm}

Output:Setoftrends,A={a1...al},l�m

A←∅newtrenda

foreachsi∈Sdoiff(gradsi,lensi)>0then

ifsiandsi−1areindifferentgradientthenaddatoA

newtrenda

addsitoa

elseiflensi<λthen

ifsiandsi−1areindifferentgradientthenifsiandsi+1areinsamegradientthen

addatoA

newtrenda

addsitoa

elseaddsitoa

elseifgradsi<αthenaddatoA

newtrenda

7.3PrototypicalPatternAbstractioninPhysiological

TimeSeriesData

7.3.1BackgroundonPatternAbstraction

Dealingwithlargetimeserieswithhighgranularitiesistypicallyachal-lenge[134].Oneoftheobjectivesofthissectionistofindprototypicalpat-ternsinsequentialdata.Thisismainlyrelatedtothegeneraltaskofpatternabstraction[82].Themaingoalofprototypicalpatternabstractionistopro-videasetofrepresentativepatternsfromrawtimeseriesdata,whichincludestwophases:1)discretisationand2)clustering.

Discretisationorsegmentationisasolutiontotransformatimeseriest=(t1,···,tn)withntimepointsintoadiscretesequenceofsegmentsS(t):s1s2···sm,wheregenerallym�n.Withindifferentapproachesfortimeseriesdiscretisation[82],aslidingwindowmethodisthemostcommonlyusedalgorithm.Inaslidingwindowapproach,atimeseriestisdiscretisedtoasetofsegmentsS(t)byslidingawindowofsizewwithagivenoverlapontwo


Algorithm 7.1: Partial Trend Detection

Input: Set of segments, S = {s1 . . . sm}

Output: Set of trends, A = {a1 . . .al}, l � m

A ← ∅new trend a

foreach si ∈ S doif f(gradsi

, lensi) > 0 then

if si and si−1 are in different gradient thenadd a to A

new trend a

add si to a

elseif lensi

< λ thenif si and si−1 are in different gradient then

if si and si+1 are in same gradient thenadd a to A

new trend a

add si to a

elseadd si to a

else if gradsi< α then

add a to A

new trend a

7.3 Prototypical Pattern Abstraction in Physiological

Time Series Data

7.3.1 Background on Pattern Abstraction

Dealing with large time series with high granularities is typically a chal-lenge [134]. One of the objectives of this section is to find prototypical pat-terns in sequential data. This is mainly related to the general task of patternabstraction [82]. The main goal of prototypical pattern abstraction is to pro-vide a set of representative patterns from raw time series data, which includestwo phases: 1) discretisation and 2) clustering.

Discretisation or segmentation is a solution to transform a time seriest = (t1,· · · , tn) with n time points into a discrete sequence of segmentsS(t) : s1s2· · · sm, where generally m � n. Within different approaches fortime series discretisation [82], a sliding window method is the most commonlyused algorithm. In a sliding window approach, a time series t is discretised to aset of segments S(t) by sliding a window of size w with a given overlap on two


Algorithm7.1:PartialTrendDetection

Input:Setofsegments,S={s1...sm}

Output:Setoftrends,A={a1...al},l�m

A←∅newtrenda

foreachsi∈Sdoiff(gradsi,lensi)>0then

ifsiandsi−1areindifferentgradientthenaddatoA

newtrenda

addsitoa

elseiflensi<λthen

ifsiandsi−1areindifferentgradientthenifsiandsi+1areinsamegradientthen

addatoA

newtrenda

addsitoa

elseaddsitoa

elseifgradsi<αthenaddatoA

newtrenda

7.3PrototypicalPatternAbstractioninPhysiological

TimeSeriesData

7.3.1BackgroundonPatternAbstraction

Dealingwithlargetimeserieswithhighgranularitiesistypicallyachal-lenge[134].Oneoftheobjectivesofthissectionistofindprototypicalpat-ternsinsequentialdata.Thisismainlyrelatedtothegeneraltaskofpatternabstraction[82].Themaingoalofprototypicalpatternabstractionistopro-videasetofrepresentativepatternsfromrawtimeseriesdata,whichincludestwophases:1)discretisationand2)clustering.

Discretisationorsegmentationisasolutiontotransformatimeseriest=(t1,···,tn)withntimepointsintoadiscretesequenceofsegmentsS(t):s1s2···sm,wheregenerallym�n.Withindifferentapproachesfortimeseriesdiscretisation[82],aslidingwindowmethodisthemostcommonlyusedalgorithm.Inaslidingwindowapproach,atimeseriestisdiscretisedtoasetofsegmentsS(t)byslidingawindowofsizewwithagivenoverlapontwo


Figure 7.5: The output of partial trend detection algorithm for two segmentedtime series (HR on top and RR on bottom).

consecutive windows. Each segment si = (ti1 ,· · · , tiw) is a subsequence of thetime series t, (1 � i � m). The provided segments are potentially the candidateto describe the unique attributes of the input data.

Clustering techniques are used for categorising the subsequences of time se-ries, in order to exploit a reasonable number of representative patterns fromnumerous segments. The advantage of using a clustering algorithm is that theprototypical patterns are provided in a data-driven way without involving anydomain knowledge to customise the typical patterns. Applying a mean normal-isation on each segment is a part of clustering progress to minimise the effectof amplitudes of segments. Clustering subsequences of time series is a challeng-ing matter discussed in the literature [71] and dependent on the discretisationalgorithm and distance method. Several clustering algorithms (e.g. k-means,hierarchical, DBscan, etc.) along the various distance measures [145] can beapplied on the data to cluster all the subsequences si ∈ S(t) of time series t intoa specific number of clusters (k). Cluster centres are considered as the proto-typical patterns of the time series. In other words, these patterns are capturedby averaging on the subsequences of clusters. Suppose Ct = {c1,· · · , ck} is theset of prototypical patterns (clustering centres) of time series t, in which a pro-totypical pattern cj = (t ′j1

,· · · , t ′jw) is not necessarily an exact subsequence oftime series t. Thus, in the sequence of segments S(t), by replacing each segmentsi with its cluster centre, the corresponding sequence of prototypical patternsP(t) is generated as: P(t) : p1· · ·pm, where pi ∈ Ct.


Figure7.5:Theoutputofpartialtrenddetectionalgorithmfortwosegmentedtimeseries(HRontopandRRonbottom).

consecutivewindows.Eachsegmentsi=(ti1,···,tiw)isasubsequenceofthetimeseriest,(1�i�m).Theprovidedsegmentsarepotentiallythecandidatetodescribetheuniqueattributesoftheinputdata.

Clusteringtechniquesareusedforcategorisingthesubsequencesoftimese-ries,inordertoexploitareasonablenumberofrepresentativepatternsfromnumeroussegments.Theadvantageofusingaclusteringalgorithmisthattheprototypicalpatternsareprovidedinadata-drivenwaywithoutinvolvinganydomainknowledgetocustomisethetypicalpatterns.Applyingameannormal-isationoneachsegmentisapartofclusteringprogresstominimisetheeffectofamplitudesofsegments.Clusteringsubsequencesoftimeseriesisachalleng-ingmatterdiscussedintheliterature[71]anddependentonthediscretisationalgorithmanddistancemethod.Severalclusteringalgorithms(e.g.k-means,hierarchical,DBscan,etc.)alongthevariousdistancemeasures[145]canbeappliedonthedatatoclusterallthesubsequencessi∈S(t)oftimeseriestintoaspecificnumberofclusters(k).Clustercentresareconsideredastheproto-typicalpatternsofthetimeseries.Inotherwords,thesepatternsarecapturedbyaveragingonthesubsequencesofclusters.SupposeCt={c1,···,ck}isthesetofprototypicalpatterns(clusteringcentres)oftimeseriest,inwhichapro-totypicalpatterncj=(t′

j1,···,t′jw)isnotnecessarilyanexactsubsequenceoftimeseriest.Thus,inthesequenceofsegmentsS(t),byreplacingeachsegmentsiwithitsclustercentre,thecorrespondingsequenceofprototypicalpatternsP(t)isgeneratedas:P(t):p1···pm,wherepi∈Ct.


Figure 7.5: The output of partial trend detection algorithm for two segmentedtime series (HR on top and RR on bottom).

consecutive windows. Each segment si = (ti1 ,· · · , tiw) is a subsequence of thetime series t, (1 � i � m). The provided segments are potentially the candidateto describe the unique attributes of the input data.

Clustering techniques are used for categorising the subsequences of time se-ries, in order to exploit a reasonable number of representative patterns fromnumerous segments. The advantage of using a clustering algorithm is that theprototypical patterns are provided in a data-driven way without involving anydomain knowledge to customise the typical patterns. Applying a mean normal-isation on each segment is a part of clustering progress to minimise the effectof amplitudes of segments. Clustering subsequences of time series is a challeng-ing matter discussed in the literature [71] and dependent on the discretisationalgorithm and distance method. Several clustering algorithms (e.g. k-means,hierarchical, DBscan, etc.) along the various distance measures [145] can beapplied on the data to cluster all the subsequences si ∈ S(t) of time series t intoa specific number of clusters (k). Cluster centres are considered as the proto-typical patterns of the time series. In other words, these patterns are capturedby averaging on the subsequences of clusters. Suppose Ct = {c1,· · · , ck} is theset of prototypical patterns (clustering centres) of time series t, in which a pro-totypical pattern cj = (t ′j1

,· · · , t ′jw) is not necessarily an exact subsequence oftime series t. Thus, in the sequence of segments S(t), by replacing each segmentsi with its cluster centre, the corresponding sequence of prototypical patternsP(t) is generated as: P(t) : p1· · ·pm, where pi ∈ Ct.


Figure7.5:Theoutputofpartialtrenddetectionalgorithmfortwosegmentedtimeseries(HRontopandRRonbottom).

consecutivewindows.Eachsegmentsi=(ti1,···,tiw)isasubsequenceofthetimeseriest,(1�i�m).Theprovidedsegmentsarepotentiallythecandidatetodescribetheuniqueattributesoftheinputdata.

Clusteringtechniquesareusedforcategorisingthesubsequencesoftimese-ries,inordertoexploitareasonablenumberofrepresentativepatternsfromnumeroussegments.Theadvantageofusingaclusteringalgorithmisthattheprototypicalpatternsareprovidedinadata-drivenwaywithoutinvolvinganydomainknowledgetocustomisethetypicalpatterns.Applyingameannormal-isationoneachsegmentisapartofclusteringprogresstominimisetheeffectofamplitudesofsegments.Clusteringsubsequencesoftimeseriesisachalleng-ingmatterdiscussedintheliterature[71]anddependentonthediscretisationalgorithmanddistancemethod.Severalclusteringalgorithms(e.g.k-means,hierarchical,DBscan,etc.)alongthevariousdistancemeasures[145]canbeappliedonthedatatoclusterallthesubsequencessi∈S(t)oftimeseriestintoaspecificnumberofclusters(k).Clustercentresareconsideredastheproto-typicalpatternsofthetimeseries.Inotherwords,thesepatternsarecapturedbyaveragingonthesubsequencesofclusters.SupposeCt={c1,···,ck}isthesetofprototypicalpatterns(clusteringcentres)oftimeseriest,inwhichapro-totypicalpatterncj=(t′

j1,···,t′jw)isnotnecessarilyanexactsubsequenceoftimeseriest.Thus,inthesequenceofsegmentsS(t),byreplacingeachsegmentsiwithitsclustercentre,thecorrespondingsequenceofprototypicalpatternsP(t)isgeneratedas:P(t):p1···pm,wherepi∈Ct.


Figure 7.6: An example of trend detection outputs for two different resolutionsof heart rate data. Top: 22 hours data with detected trends. Bottom: 4.5 hoursdata captured from the last trend in the first diagram.

7.3.2 Prototypical Pattern Abstraction

The considered measurement for each condition includes three mentioned vari-ables HR, BP, and RR. Suppose three time series tHR, tBP, and tRR, with thesame length of n corresponds to the measurements in HR, BP, and RR, respec-tively.

Discretisation: In order to provide the sequence of prototypical patterns foreach time series, the first step is a discretisation method, introduced in Sec-tion 7.3.1. Discretisation of time series data needs first to determine the sizeof sliding window (w). Since this approach aims to provide a set of descriptiverules based on the patterns, a meaningful range of values for the size of the slid-ing window (w) is tested. The length of overlap of two consecutive windows isset to half of the window’s size, in order to avoid certain breaks in the signalsthat might lead to losing a segment involving prototypical patterns. By applyingthe discretisation method to the time series tHR, tBP, and tRR, the sequences of


Figure7.6:Anexampleoftrenddetectionoutputsfortwodifferentresolutionsofheartratedata.Top:22hoursdatawithdetectedtrends.Bottom:4.5hoursdatacapturedfromthelasttrendinthefirstdiagram.

7.3.2PrototypicalPatternAbstraction

Theconsideredmeasurementforeachconditionincludesthreementionedvari-ablesHR,BP,andRR.SupposethreetimeseriestHR,tBP,andtRR,withthesamelengthofncorrespondstothemeasurementsinHR,BP,andRR,respec-tively.

Discretisation:Inordertoprovidethesequenceofprototypicalpatternsforeachtimeseries,thefirststepisadiscretisationmethod,introducedinSec-tion7.3.1.Discretisationoftimeseriesdataneedsfirsttodeterminethesizeofslidingwindow(w).Sincethisapproachaimstoprovideasetofdescriptiverulesbasedonthepatterns,ameaningfulrangeofvaluesforthesizeoftheslid-ingwindow(w)istested.Thelengthofoverlapoftwoconsecutivewindowsissettohalfofthewindow’ssize,inordertoavoidcertainbreaksinthesignalsthatmightleadtolosingasegmentinvolvingprototypicalpatterns.ByapplyingthediscretisationmethodtothetimeseriestHR,tBP,andtRR,thesequencesof


Figure 7.6: An example of trend detection outputs for two different resolutionsof heart rate data. Top: 22 hours data with detected trends. Bottom: 4.5 hoursdata captured from the last trend in the first diagram.

7.3.2 Prototypical Pattern Abstraction

The considered measurement for each condition includes three mentioned vari-ables HR, BP, and RR. Suppose three time series tHR, tBP, and tRR, with thesame length of n corresponds to the measurements in HR, BP, and RR, respec-tively.

Discretisation: In order to provide the sequence of prototypical patterns foreach time series, the first step is a discretisation method, introduced in Sec-tion 7.3.1. Discretisation of time series data needs first to determine the sizeof sliding window (w). Since this approach aims to provide a set of descriptiverules based on the patterns, a meaningful range of values for the size of the slid-ing window (w) is tested. The length of overlap of two consecutive windows isset to half of the window’s size, in order to avoid certain breaks in the signalsthat might lead to losing a segment involving prototypical patterns. By applyingthe discretisation method to the time series tHR, tBP, and tRR, the sequences of


Figure7.6:Anexampleoftrenddetectionoutputsfortwodifferentresolutionsofheartratedata.Top:22hoursdatawithdetectedtrends.Bottom:4.5hoursdatacapturedfromthelasttrendinthefirstdiagram.

7.3.2PrototypicalPatternAbstraction

Theconsideredmeasurementforeachconditionincludesthreementionedvari-ablesHR,BP,andRR.SupposethreetimeseriestHR,tBP,andtRR,withthesamelengthofncorrespondstothemeasurementsinHR,BP,andRR,respec-tively.

Discretisation:Inordertoprovidethesequenceofprototypicalpatternsforeachtimeseries,thefirststepisadiscretisationmethod,introducedinSec-tion7.3.1.Discretisationoftimeseriesdataneedsfirsttodeterminethesizeofslidingwindow(w).Sincethisapproachaimstoprovideasetofdescriptiverulesbasedonthepatterns,ameaningfulrangeofvaluesforthesizeoftheslid-ingwindow(w)istested.Thelengthofoverlapoftwoconsecutivewindowsissettohalfofthewindow’ssize,inordertoavoidcertainbreaksinthesignalsthatmightleadtolosingasegmentinvolvingprototypicalpatterns.ByapplyingthediscretisationmethodtothetimeseriestHR,tBP,andtRR,thesequencesof


segments will be obtained for physiological variables as S(tHR), S(tBP), andS(tRR), where |S(tv)| = 2 × (n/w) − 1, and v ∈ {HR, BP, RR}.

Clustering: The next phase is to extract the prototypical patterns of eachtime series using clustering methods. Here, a k-means clustering is applied toeach set of segments, to categorise the segments into a set of clusters (k). Inthis algorithm, k segments are selected as initial centres. Then other segmentsare assigned to these centres based on their similarity, and the centre of eachcluster is updated. This process is repeated until the centres do not change[106]. To optimise the number of clusters [121], a range of values is validatedby considering the modelling results while the clustering approach is used fortemporal rule mining task (this will be described in Section 8.1.4).

Before applying clustering, each subsequence is prepared as follows: If thereare several artefacts in a subsequence, then this subsequence is not consideredfor further processing. The maximum number of allowed artefacts in a subse-quence should be less than half of the length of the subsequence. If the numberof artefacts in the segment’s values exceeds a defined threshold, the segmentsi is removed from S(tv). Otherwise, the artefacts will be replaced with thevalues given by an interpolation method (i.e. cubic interpolation). After that,each segment si ∈ S(tv) (with the average value μsi

) is normalised to get zeromean by subtracting the μsi

from all values of si. This normalisation will inval-idate the amplitude of segment values. So, the focus will be given to the trendsof segments while clustering applies. The normalisation is crucial, because thesegments with the same shape and trend need to be categorised in the samecluster, rather than the segments with a similar range of values. The k-meansalgorithm classifies the pre-processed segments of S(tv) into k clusters, with theset of centres Ctv . Then the corresponding sequence of the prototypical patternsP(tv) is defined as: P(tv) : p1· · ·p|S(tv)|, where pi ∈ Ctv and 1 � i � |S(tv)|.

It is worth noting that applying various distance functions implies the var-ious amount of computational effort. The Euclidean distance is employed inthe clustering algorithm since it provides an efficient computation of the dis-tances between segments. Moreover, different clustering algorithms with var-ious distance functions (e.g. Euclidean distance, dynamic time warping, etc.)will provide distinct clusters and construct multiple prototypical patterns [172].However, due to the time complexity of the other algorithms on big data, the k-means algorithm with Euclidean distance is employed in the clustering progress,which is not guaranteed to be the optimal, but sufficient for the goal of findingmeaningful patterns.

An example of abstracted prototypical patterns from a sensor reading isshown in Figure 7.7a, which depicts the cluster centres obtained from HR sen-sor data in CHF condition (w = 180, k = 7). Figure 7.7b presents the sequenceof prototypical patterns (PtHR

) for the first two hours of HR data shown inFigure 7.3.


segmentswillbeobtainedforphysiologicalvariablesasS(tHR),S(tBP),andS(tRR),where|S(tv)|=2×(n/w)−1,andv∈{HR,BP,RR}.

Clustering:Thenextphaseistoextracttheprototypicalpatternsofeachtimeseriesusingclusteringmethods.Here,ak-meansclusteringisappliedtoeachsetofsegments,tocategorisethesegmentsintoasetofclusters(k).Inthisalgorithm,ksegmentsareselectedasinitialcentres.Thenothersegmentsareassignedtothesecentresbasedontheirsimilarity,andthecentreofeachclusterisupdated.Thisprocessisrepeateduntilthecentresdonotchange[106].Tooptimisethenumberofclusters[121],arangeofvaluesisvalidatedbyconsideringthemodellingresultswhiletheclusteringapproachisusedfortemporalruleminingtask(thiswillbedescribedinSection8.1.4).

Beforeapplyingclustering,eachsubsequenceispreparedasfollows:Ifthereareseveralartefactsinasubsequence,thenthissubsequenceisnotconsideredforfurtherprocessing.Themaximumnumberofallowedartefactsinasubse-quenceshouldbelessthanhalfofthelengthofthesubsequence.Ifthenumberofartefactsinthesegment’svaluesexceedsadefinedthreshold,thesegmentsiisremovedfromS(tv).Otherwise,theartefactswillbereplacedwiththevaluesgivenbyaninterpolationmethod(i.e.cubicinterpolation).Afterthat,eachsegmentsi∈S(tv)(withtheaveragevalueμsi)isnormalisedtogetzeromeanbysubtractingtheμsifromallvaluesofsi.Thisnormalisationwillinval-idatetheamplitudeofsegmentvalues.So,thefocuswillbegiventothetrendsofsegmentswhileclusteringapplies.Thenormalisationiscrucial,becausethesegmentswiththesameshapeandtrendneedtobecategorisedinthesamecluster,ratherthanthesegmentswithasimilarrangeofvalues.Thek-meansalgorithmclassifiesthepre-processedsegmentsofS(tv)intokclusters,withthesetofcentresCtv.ThenthecorrespondingsequenceoftheprototypicalpatternsP(tv)isdefinedas:P(tv):p1···p|S(tv)|,wherepi∈Ctvand1�i�|S(tv)|.

Itisworthnotingthatapplyingvariousdistancefunctionsimpliesthevar-iousamountofcomputationaleffort.TheEuclideandistanceisemployedintheclusteringalgorithmsinceitprovidesanefficientcomputationofthedis-tancesbetweensegments.Moreover,differentclusteringalgorithmswithvar-iousdistancefunctions(e.g.Euclideandistance,dynamictimewarping,etc.)willprovidedistinctclustersandconstructmultipleprototypicalpatterns[172].However,duetothetimecomplexityoftheotheralgorithmsonbigdata,thek-meansalgorithmwithEuclideandistanceisemployedintheclusteringprogress,whichisnotguaranteedtobetheoptimal,butsufficientforthegoaloffindingmeaningfulpatterns.

AnexampleofabstractedprototypicalpatternsfromasensorreadingisshowninFigure7.7a,whichdepictstheclustercentresobtainedfromHRsen-sordatainCHFcondition(w=180,k=7).Figure7.7bpresentsthesequenceofprototypicalpatterns(PtHR)forthefirsttwohoursofHRdatashowninFigure7.3.


segments will be obtained for physiological variables as S(tHR), S(tBP), andS(tRR), where |S(tv)| = 2 × (n/w) − 1, and v ∈ {HR, BP, RR}.

Clustering: The next phase is to extract the prototypical patterns of eachtime series using clustering methods. Here, a k-means clustering is applied toeach set of segments, to categorise the segments into a set of clusters (k). Inthis algorithm, k segments are selected as initial centres. Then other segmentsare assigned to these centres based on their similarity, and the centre of eachcluster is updated. This process is repeated until the centres do not change[106]. To optimise the number of clusters [121], a range of values is validatedby considering the modelling results while the clustering approach is used fortemporal rule mining task (this will be described in Section 8.1.4).

Before applying clustering, each subsequence is prepared as follows: If thereare several artefacts in a subsequence, then this subsequence is not consideredfor further processing. The maximum number of allowed artefacts in a subse-quence should be less than half of the length of the subsequence. If the numberof artefacts in the segment’s values exceeds a defined threshold, the segmentsi is removed from S(tv). Otherwise, the artefacts will be replaced with thevalues given by an interpolation method (i.e. cubic interpolation). After that,each segment si ∈ S(tv) (with the average value μsi

) is normalised to get zeromean by subtracting the μsi

from all values of si. This normalisation will inval-idate the amplitude of segment values. So, the focus will be given to the trendsof segments while clustering applies. The normalisation is crucial, because thesegments with the same shape and trend need to be categorised in the samecluster, rather than the segments with a similar range of values. The k-meansalgorithm classifies the pre-processed segments of S(tv) into k clusters, with theset of centres Ctv . Then the corresponding sequence of the prototypical patternsP(tv) is defined as: P(tv) : p1· · ·p|S(tv)|, where pi ∈ Ctv and 1 � i � |S(tv)|.

It is worth noting that applying various distance functions implies the var-ious amount of computational effort. The Euclidean distance is employed inthe clustering algorithm since it provides an efficient computation of the dis-tances between segments. Moreover, different clustering algorithms with var-ious distance functions (e.g. Euclidean distance, dynamic time warping, etc.)will provide distinct clusters and construct multiple prototypical patterns [172].However, due to the time complexity of the other algorithms on big data, the k-means algorithm with Euclidean distance is employed in the clustering progress,which is not guaranteed to be the optimal, but sufficient for the goal of findingmeaningful patterns.

An example of abstracted prototypical patterns from a sensor reading isshown in Figure 7.7a, which depicts the cluster centres obtained from HR sen-sor data in CHF condition (w = 180, k = 7). Figure 7.7b presents the sequenceof prototypical patterns (PtHR

) for the first two hours of HR data shown inFigure 7.3.


segmentswillbeobtainedforphysiologicalvariablesasS(tHR),S(tBP),andS(tRR),where|S(tv)|=2×(n/w)−1,andv∈{HR,BP,RR}.

Clustering:Thenextphaseistoextracttheprototypicalpatternsofeachtimeseriesusingclusteringmethods.Here,ak-meansclusteringisappliedtoeachsetofsegments,tocategorisethesegmentsintoasetofclusters(k).Inthisalgorithm,ksegmentsareselectedasinitialcentres.Thenothersegmentsareassignedtothesecentresbasedontheirsimilarity,andthecentreofeachclusterisupdated.Thisprocessisrepeateduntilthecentresdonotchange[106].Tooptimisethenumberofclusters[121],arangeofvaluesisvalidatedbyconsideringthemodellingresultswhiletheclusteringapproachisusedfortemporalruleminingtask(thiswillbedescribedinSection8.1.4).

Beforeapplyingclustering,eachsubsequenceispreparedasfollows:Ifthereareseveralartefactsinasubsequence,thenthissubsequenceisnotconsideredforfurtherprocessing.Themaximumnumberofallowedartefactsinasubse-quenceshouldbelessthanhalfofthelengthofthesubsequence.Ifthenumberofartefactsinthesegment’svaluesexceedsadefinedthreshold,thesegmentsiisremovedfromS(tv).Otherwise,theartefactswillbereplacedwiththevaluesgivenbyaninterpolationmethod(i.e.cubicinterpolation).Afterthat,eachsegmentsi∈S(tv)(withtheaveragevalueμsi)isnormalisedtogetzeromeanbysubtractingtheμsifromallvaluesofsi.Thisnormalisationwillinval-idatetheamplitudeofsegmentvalues.So,thefocuswillbegiventothetrendsofsegmentswhileclusteringapplies.Thenormalisationiscrucial,becausethesegmentswiththesameshapeandtrendneedtobecategorisedinthesamecluster,ratherthanthesegmentswithasimilarrangeofvalues.Thek-meansalgorithmclassifiesthepre-processedsegmentsofS(tv)intokclusters,withthesetofcentresCtv.ThenthecorrespondingsequenceoftheprototypicalpatternsP(tv)isdefinedas:P(tv):p1···p|S(tv)|,wherepi∈Ctvand1�i�|S(tv)|.

Itisworthnotingthatapplyingvariousdistancefunctionsimpliesthevar-iousamountofcomputationaleffort.TheEuclideandistanceisemployedintheclusteringalgorithmsinceitprovidesanefficientcomputationofthedis-tancesbetweensegments.Moreover,differentclusteringalgorithmswithvar-iousdistancefunctions(e.g.Euclideandistance,dynamictimewarping,etc.)willprovidedistinctclustersandconstructmultipleprototypicalpatterns[172].However,duetothetimecomplexityoftheotheralgorithmsonbigdata,thek-meansalgorithmwithEuclideandistanceisemployedintheclusteringprogress,whichisnotguaranteedtobetheoptimal,butsufficientforthegoaloffindingmeaningfulpatterns.

AnexampleofabstractedprototypicalpatternsfromasensorreadingisshowninFigure7.7a,whichdepictstheclustercentresobtainedfromHRsen-sordatainCHFcondition(w=180,k=7).Figure7.7bpresentsthesequenceofprototypicalpatterns(PtHR)forthefirsttwohoursofHRdatashowninFigure7.3.

7.4. DISCUSSION AND SUMMARY 117

50 100 150

−0.05

0

0.05

50 100 150−5

0

5

50 100 150

−2

0

2

50 100 150

−2

0

2

50 100 150−2

0

2

50 100 150−5

0

5

50 100 150

−1

0

1

2

(a) Cluster centres (CtHR) as prototypical patterns.

−8

0

8

time (2 hours)

be

ats

/min

(b) Sequence of prototypical patterns (PtHR: p1· · ·pm).

Figure 7.7: An example of prototypical patterns for HR data in CHF condition.

7.4 Discussion and Summary

This chapter has first presented the process of collecting and acquiring the inputphysiological sensor data sets. This process involved collecting data from wear-able sensors as well as acquiring proper physiological sensor data from clinicalconditions. Afterwards, this chapter has provided two main processes to anal-yse time series sensor data to catch the crucial behaviours of data. First, a trenddetection method has been presented to capture the partial trends among thetime series. Then, a pattern abstraction approach is introduced to extract theprototypical patterns throughout the time series. These prototypical patternsare the representative abstractions of the all possible time series subsequenceswithin the data. The importance of extracting such prototypical patterns is tobe able to determine which behaviours repeatedly occur in the time series. Thisinformation is data-driven in a sense that there is no prior knowledge aboutthe patterns or no expert knowledge to define them beforehand. Also, detectingsuch prototypical patterns is not a trivial task for an expert such as cliniciansor doctors by just exploring the recorded time series data. Therefore, these pro-totypical patterns are beneficial for further investigations.

There are some limitations related to the data preprocessing and data clean-ing methods. For both trend detection and pattern abstraction algorithms, sev-

7.4.DISCUSSIONANDSUMMARY117

50100150

−0.05

0

0.05

50100150−5

0

5

50100150

−2

0

2

50100150

−2

0

2

50100150−2

0

2

50100150−5

0

5

50100150

−1

0

1

2

(a)Clustercentres(CtHR)asprototypicalpatterns.

−8

0

8

time (2 hours)

be

ats

/min

(b)Sequenceofprototypicalpatterns(PtHR:p1···pm).

Figure7.7:AnexampleofprototypicalpatternsforHRdatainCHFcondition.

7.4DiscussionandSummary

Thischapterhasfirstpresentedtheprocessofcollectingandacquiringtheinputphysiologicalsensordatasets.Thisprocessinvolvedcollectingdatafromwear-ablesensorsaswellasacquiringproperphysiologicalsensordatafromclinicalconditions.Afterwards,thischapterhasprovidedtwomainprocessestoanal-ysetimeseriessensordatatocatchthecrucialbehavioursofdata.First,atrenddetectionmethodhasbeenpresentedtocapturethepartialtrendsamongthetimeseries.Then,apatternabstractionapproachisintroducedtoextracttheprototypicalpatternsthroughoutthetimeseries.Theseprototypicalpatternsaretherepresentativeabstractionsoftheallpossibletimeseriessubsequenceswithinthedata.Theimportanceofextractingsuchprototypicalpatternsistobeabletodeterminewhichbehavioursrepeatedlyoccurinthetimeseries.Thisinformationisdata-driveninasensethatthereisnopriorknowledgeaboutthepatternsornoexpertknowledgetodefinethembeforehand.Also,detectingsuchprototypicalpatternsisnotatrivialtaskforanexpertsuchascliniciansordoctorsbyjustexploringtherecordedtimeseriesdata.Therefore,thesepro-totypicalpatternsarebeneficialforfurtherinvestigations.

Therearesomelimitationsrelatedtothedatapreprocessinganddataclean-ingmethods.Forbothtrenddetectionandpatternabstractionalgorithms,sev-


50 100 150

−0.05

0

0.05

50 100 150−5

0

5

50 100 150

−2

0

2

50 100 150

−2

0

2

50 100 150−2

0

2

50 100 150−5

0

5

50 100 150

−1

0

1

2

(a) Cluster centres (CtHR) as prototypical patterns.

−8

0

8

time (2 hours)

be

ats

/min

(b) Sequence of prototypical patterns (PtHR: p1· · ·pm).

Figure 7.7: An example of prototypical patterns for HR data in CHF condition.


This chapter has first presented the process of collecting and acquiring the inputphysiological sensor data sets. This process involved collecting data from wear-able sensors as well as acquiring proper physiological sensor data from clinicalconditions. Afterwards, this chapter has provided two main processes to anal-yse time series sensor data to catch the crucial behaviours of data. First, a trenddetection method has been presented to capture the partial trends among thetime series. Then, a pattern abstraction approach is introduced to extract theprototypical patterns throughout the time series. These prototypical patternsare the representative abstractions of the all possible time series subsequenceswithin the data. The importance of extracting such prototypical patterns is tobe able to determine which behaviours repeatedly occur in the time series. Thisinformation is data-driven in a sense that there is no prior knowledge aboutthe patterns or no expert knowledge to define them beforehand. Also, detectingsuch prototypical patterns is not a trivial task for an expert such as cliniciansor doctors by just exploring the recorded time series data. Therefore, these pro-totypical patterns are beneficial for further investigations.

There are some limitations related to the data preprocessing and data clean-ing methods. For both trend detection and pattern abstraction algorithms, sev-


50100150

−0.05

0

0.05

50100150−5

0

5

50100150

−2

0

2

50100150

−2

0

2

50100150−2

0

2

50100150−5

0

5

50100150

−1

0

1

2

(a)Clustercentres(CtHR)asprototypicalpatterns.

−8

0

8

time (2 hours)

be

ats

/min

(b)Sequenceofprototypicalpatterns(PtHR:p1···pm).

Figure7.7:AnexampleofprototypicalpatternsforHRdatainCHFcondition.


Thischapterhasfirstpresentedtheprocessofcollectingandacquiringtheinputphysiologicalsensordatasets.Thisprocessinvolvedcollectingdatafromwear-ablesensorsaswellasacquiringproperphysiologicalsensordatafromclinicalconditions.Afterwards,thischapterhasprovidedtwomainprocessestoanal-ysetimeseriessensordatatocatchthecrucialbehavioursofdata.First,atrenddetectionmethodhasbeenpresentedtocapturethepartialtrendsamongthetimeseries.Then,apatternabstractionapproachisintroducedtoextracttheprototypicalpatternsthroughoutthetimeseries.Theseprototypicalpatternsaretherepresentativeabstractionsoftheallpossibletimeseriessubsequenceswithinthedata.Theimportanceofextractingsuchprototypicalpatternsistobeabletodeterminewhichbehavioursrepeatedlyoccurinthetimeseries.Thisinformationisdata-driveninasensethatthereisnopriorknowledgeaboutthepatternsornoexpertknowledgetodefinethembeforehand.Also,detectingsuchprototypicalpatternsisnotatrivialtaskforanexpertsuchascliniciansordoctorsbyjustexploringtherecordedtimeseriesdata.Therefore,thesepro-totypicalpatternsarebeneficialforfurtherinvestigations.

Therearesomelimitationsrelatedtothedatapreprocessinganddataclean-ingmethods.Forbothtrenddetectionandpatternabstractionalgorithms,sev-


eral parameters have been set (either learned or just selected heuristically) thatare dependent on the input recorded data. For example, in the pattern abstrac-tion approach, an automatic approach has been suggested to select the param-eters of window size and number of clusters by using the given set of data.However, providing a general method to automatically perform the parameterselection task for any input data set is not investigated in this study. Besides,the computational cost of the proposed algorithms (especially for pattern ab-straction) is not optimised. There is a lack studying the suitable mechanism tominimise the computational cost of the approaches. This point will be criticalif the size of input data grows.

In sum, this chapter shows how the raw numerical sensor data can be pre-pared and processes in order to either 1) be used for more complex data min-ings such as temporal rule mining among the patterns (Section 8.1), 2) be di-rectly mapped to linguistic descriptions using template-based approaches (Sec-tion 8.2), or 3) be fed as perceived information to a semantic representation tobe turned to linguistic descriptions using the semantic inference (Chapter 9).


eralparametershavebeenset(eitherlearnedorjustselectedheuristically)thataredependentontheinputrecordeddata.Forexample,inthepatternabstrac-tionapproach,anautomaticapproachhasbeensuggestedtoselecttheparam-etersofwindowsizeandnumberofclustersbyusingthegivensetofdata.However,providingageneralmethodtoautomaticallyperformtheparameterselectiontaskforanyinputdatasetisnotinvestigatedinthisstudy.Besides,thecomputationalcostoftheproposedalgorithms(especiallyforpatternab-straction)isnotoptimised.Thereisalackstudyingthesuitablemechanismtominimisethecomputationalcostoftheapproaches.Thispointwillbecriticalifthesizeofinputdatagrows.

Insum,thischaptershowshowtherawnumericalsensordatacanbepre-paredandprocessesinordertoeither1)beusedformorecomplexdatamin-ingssuchastemporalruleminingamongthepatterns(Section8.1),2)bedi-rectlymappedtolinguisticdescriptionsusingtemplate-basedapproaches(Sec-tion8.2),or3)befedasperceivedinformationtoasemanticrepresentationtobeturnedtolinguisticdescriptionsusingthesemanticinference(Chapter9).


eral parameters have been set (either learned or just selected heuristically) thatare dependent on the input recorded data. For example, in the pattern abstrac-tion approach, an automatic approach has been suggested to select the param-eters of window size and number of clusters by using the given set of data.However, providing a general method to automatically perform the parameterselection task for any input data set is not investigated in this study. Besides,the computational cost of the proposed algorithms (especially for pattern ab-straction) is not optimised. There is a lack studying the suitable mechanism tominimise the computational cost of the approaches. This point will be criticalif the size of input data grows.

In sum, this chapter shows how the raw numerical sensor data can be pre-pared and processes in order to either 1) be used for more complex data min-ings such as temporal rule mining among the patterns (Section 8.1), 2) be di-rectly mapped to linguistic descriptions using template-based approaches (Sec-tion 8.2), or 3) be fed as perceived information to a semantic representation tobe turned to linguistic descriptions using the semantic inference (Chapter 9).


eralparametershavebeenset(eitherlearnedorjustselectedheuristically)thataredependentontheinputrecordeddata.Forexample,inthepatternabstrac-tionapproach,anautomaticapproachhasbeensuggestedtoselecttheparam-etersofwindowsizeandnumberofclustersbyusingthegivensetofdata.However,providingageneralmethodtoautomaticallyperformtheparameterselectiontaskforanyinputdatasetisnotinvestigatedinthisstudy.Besides,thecomputationalcostoftheproposedalgorithms(especiallyforpatternab-straction)isnotoptimised.Thereisalackstudyingthesuitablemechanismtominimisethecomputationalcostoftheapproaches.Thispointwillbecriticalifthesizeofinputdatagrows.

Insum,thischaptershowshowtherawnumericalsensordatacanbepre-paredandprocessesinordertoeither1)beusedformorecomplexdatamin-ingssuchastemporalruleminingamongthepatterns(Section8.1),2)bedi-rectlymappedtolinguisticdescriptionsusingtemplate-basedapproaches(Sec-tion8.2),or3)befedasperceivedinformationtoasemanticrepresentationtobeturnedtolinguisticdescriptionsusingthesemanticinference(Chapter9).

Chapter 8

Mining and Describing

Physiological Time Series Data

“Nothing in the world is more exciting than a moment ofsudden discovery or invention, and many more people arecapable of experiencing such moments than is sometimesthought.”

— Bertrand Russell (1872–1970)

Following the structure of the thesis (Figure 1.2), a data preparation ap-proach has been presented in Chapter 7 to provide information for seman-

tic representations. But before showing how to apply the semantic representa-tion approaches on physiological patterns, this chapter considers, mining morecomplex information, and using template-based NLG approaches to map thenumerical data to the linguistic descriptions directly.

The output trends and patterns from data analysis can be considered asinput for different aspects of the work on the application side. Two sectionsof this chapter present two aspects of data processing and interpreting suchprepared patterns and trends. 1) The first section introduces a data processingupon the abstracted patterns by automatically mining temporal rules from thephysiological sensor data in clinical conditions. This mining of temporal ruleswill enrich the processed information into more useful knowledge. 2) The sec-ond section proposes simple linguistic description approaches to interoperatethe mind information in natural language, for all the processed information:detected trends, abstracted patterns, and temporal rules.

119

Chapter8

MiningandDescribing

PhysiologicalTimeSeriesData

“Nothingintheworldismoreexcitingthanamomentofsuddendiscoveryorinvention,andmanymorepeoplearecapableofexperiencingsuchmomentsthanissometimesthought.”

—BertrandRussell(1872–1970)

Followingthestructureofthethesis(Figure1.2),adatapreparationap-proachhasbeenpresentedinChapter7toprovideinformationforseman-

ticrepresentations.Butbeforeshowinghowtoapplythesemanticrepresenta-tionapproachesonphysiologicalpatterns,thischapterconsiders,miningmorecomplexinformation,andusingtemplate-basedNLGapproachestomapthenumericaldatatothelinguisticdescriptionsdirectly.

Theoutputtrendsandpatternsfromdataanalysiscanbeconsideredasinputfordifferentaspectsoftheworkontheapplicationside.Twosectionsofthischapterpresenttwoaspectsofdataprocessingandinterpretingsuchpreparedpatternsandtrends.1)Thefirstsectionintroducesadataprocessingupontheabstractedpatternsbyautomaticallyminingtemporalrulesfromthephysiologicalsensordatainclinicalconditions.Thisminingoftemporalruleswillenrichtheprocessedinformationintomoreusefulknowledge.2)Thesec-ondsectionproposessimplelinguisticdescriptionapproachestointeroperatethemindinformationinnaturallanguage,foralltheprocessedinformation:detectedtrends,abstractedpatterns,andtemporalrules.

119

Chapter 8

Mining and Describing

Physiological Time Series Data

“Nothing in the world is more exciting than a moment ofsudden discovery or invention, and many more people arecapable of experiencing such moments than is sometimesthought.”

— Bertrand Russell (1872–1970)

Following the structure of the thesis (Figure 1.2), a data preparation ap-proach has been presented in Chapter 7 to provide information for seman-

tic representations. But before showing how to apply the semantic representa-tion approaches on physiological patterns, this chapter considers, mining morecomplex information, and using template-based NLG approaches to map thenumerical data to the linguistic descriptions directly.

The output trends and patterns from data analysis can be considered asinput for different aspects of the work on the application side. Two sectionsof this chapter present two aspects of data processing and interpreting suchprepared patterns and trends. 1) The first section introduces a data processingupon the abstracted patterns by automatically mining temporal rules from thephysiological sensor data in clinical conditions. This mining of temporal ruleswill enrich the processed information into more useful knowledge. 2) The sec-ond section proposes simple linguistic description approaches to interoperatethe mind information in natural language, for all the processed information:detected trends, abstracted patterns, and temporal rules.

119

Chapter8

MiningandDescribing

PhysiologicalTimeSeriesData

“Nothingintheworldismoreexcitingthanamomentofsuddendiscoveryorinvention,andmanymorepeoplearecapableofexperiencingsuchmomentsthanissometimesthought.”

—BertrandRussell(1872–1970)

Followingthestructureofthethesis(Figure1.2),adatapreparationap-proachhasbeenpresentedinChapter7toprovideinformationforseman-

ticrepresentations.Butbeforeshowinghowtoapplythesemanticrepresenta-tionapproachesonphysiologicalpatterns,thischapterconsiders,miningmorecomplexinformation,andusingtemplate-basedNLGapproachestomapthenumericaldatatothelinguisticdescriptionsdirectly.

Theoutputtrendsandpatternsfromdataanalysiscanbeconsideredasinputfordifferentaspectsoftheworkontheapplicationside.Twosectionsofthischapterpresenttwoaspectsofdataprocessingandinterpretingsuchpreparedpatternsandtrends.1)Thefirstsectionintroducesadataprocessingupontheabstractedpatternsbyautomaticallyminingtemporalrulesfromthephysiologicalsensordatainclinicalconditions.Thisminingoftemporalruleswillenrichtheprocessedinformationintomoreusefulknowledge.2)Thesec-ondsectionproposessimplelinguisticdescriptionapproachestointeroperatethemindinformationinnaturallanguage,foralltheprocessedinformation:detectedtrends,abstractedpatterns,andtemporalrules.

119

120 CHAPTER 8. MINING & DESCRIBING PHYSIO. DATA

8.1 Mining Temporal Rules in Physiological Sensor

Data

This section is dedicated to present temporal rule mining as a data-driven anal-ysis of the abstracted patterns. The generated set of rules captures the temporalrelationships between patterns of physiological data. This rule mining approachis presented in this thesis to be used later in a template-based linguistic descrip-tion approach (explained in Section 8.2). This section first presents an overviewof association rule mining methods in general and mining temporal rules in themedical domain. Afterwards, it gives details on the proposed approach to ex-tract temporal rules from multi-channels of physiological time series data. Theoutput rules for clinical conditions are then compared using a rule similaritymeasure that highlights the uniqueness of the rules in each condition.

8.1.1 Background on Temporal Rule Mining

Temporal rule mining is a promising approach to generate meaningful associa-tion rules from sequential data [197]. This section first describes the standardassociation rule mining method. Then it reviews rule mining approaches fortemporal data in the medical domain.

Association Rule Discovery

Suppose I = {i1,· · · , id} is a set of items (e.g. all the products in a store), andD = {d1,· · · ,dN} is a transactional database with N transactions (e.g. all theshopping lists in a year). The support of an itemset A ⊂ I is the frequency ofthe occurrence of A in all the transactions of D. The standard association rulemining provides a set of rules in form of A ⇒ B. In this rule, A is antecedent andB is the consequent, which are disjoint itemsets. Generally, a rule like A ⇒ B

means if the items of A occur in a transaction di, then the items of B alsowill plausibly appear in di. Typical measures to show the strength of a ruleare support (sup) and confidence (conf). Support of a rule shows how oftenthe rule itemsets occur in the database. Further, the confidence of rule A ⇒ B

determines how frequently the itemset B occurs in transactions which containitemset A. Let PD(A) be the probability of the occurrence of an itemset A in D.Then, support and confidence of the rule A ⇒ B are defined as [220]:

sup(A ⇒ B) = PD(A ∪ B), (8.1)

conf(A ⇒ B) = pD(A|B) = sup(A ⇒ B)/pD(A) (8.2)

The rules with sufficient support and confidence are typically known asstrong rules. The values of minsup and minconf are specified as the thresholdsfor strong and meaningful rules. Association rules with low support may have

120CHAPTER8.MINING&DESCRIBINGPHYSIO.DATA

8.1MiningTemporalRulesinPhysiologicalSensor

Data

Thissectionisdedicatedtopresenttemporalruleminingasadata-drivenanal-ysisoftheabstractedpatterns.Thegeneratedsetofrulescapturesthetemporalrelationshipsbetweenpatternsofphysiologicaldata.Thisruleminingapproachispresentedinthisthesistobeusedlaterinatemplate-basedlinguisticdescrip-tionapproach(explainedinSection8.2).Thissectionfirstpresentsanoverviewofassociationruleminingmethodsingeneralandminingtemporalrulesinthemedicaldomain.Afterwards,itgivesdetailsontheproposedapproachtoex-tracttemporalrulesfrommulti-channelsofphysiologicaltimeseriesdata.Theoutputrulesforclinicalconditionsarethencomparedusingarulesimilaritymeasurethathighlightstheuniquenessoftherulesineachcondition.

8.1.1BackgroundonTemporalRuleMining

Temporalruleminingisapromisingapproachtogeneratemeaningfulassocia-tionrulesfromsequentialdata[197].Thissectionfirstdescribesthestandardassociationruleminingmethod.Thenitreviewsruleminingapproachesfortemporaldatainthemedicaldomain.

AssociationRuleDiscovery

SupposeI={i1,···,id}isasetofitems(e.g.alltheproductsinastore),andD={d1,···,dN}isatransactionaldatabasewithNtransactions(e.g.alltheshoppinglistsinayear).ThesupportofanitemsetA⊂IisthefrequencyoftheoccurrenceofAinallthetransactionsofD.ThestandardassociationruleminingprovidesasetofrulesinformofA⇒B.Inthisrule,AisantecedentandBistheconsequent,whicharedisjointitemsets.Generally,arulelikeA⇒B

meansiftheitemsofAoccurinatransactiondi,thentheitemsofBalsowillplausiblyappearindi.Typicalmeasurestoshowthestrengthofarulearesupport(sup)andconfidence(conf).Supportofaruleshowshowoftentheruleitemsetsoccurinthedatabase.Further,theconfidenceofruleA⇒B

determineshowfrequentlytheitemsetBoccursintransactionswhichcontainitemsetA.LetPD(A)betheprobabilityoftheoccurrenceofanitemsetAinD.Then,supportandconfidenceoftheruleA⇒Baredefinedas[220]:

sup(A⇒B)=PD(A∪B),(8.1)

conf(A⇒B)=pD(A|B)=sup(A⇒B)/pD(A)(8.2)

Theruleswithsufficientsupportandconfidencearetypicallyknownasstrongrules.Thevaluesofminsupandminconfarespecifiedasthethresholdsforstrongandmeaningfulrules.Associationruleswithlowsupportmayhave


8.1 Mining Temporal Rules in Physiological Sensor

Data

This section is dedicated to present temporal rule mining as a data-driven anal-ysis of the abstracted patterns. The generated set of rules captures the temporalrelationships between patterns of physiological data. This rule mining approachis presented in this thesis to be used later in a template-based linguistic descrip-tion approach (explained in Section 8.2). This section first presents an overviewof association rule mining methods in general and mining temporal rules in themedical domain. Afterwards, it gives details on the proposed approach to ex-tract temporal rules from multi-channels of physiological time series data. Theoutput rules for clinical conditions are then compared using a rule similaritymeasure that highlights the uniqueness of the rules in each condition.

8.1.1 Background on Temporal Rule Mining

Temporal rule mining is a promising approach to generate meaningful associa-tion rules from sequential data [197]. This section first describes the standardassociation rule mining method. Then it reviews rule mining approaches fortemporal data in the medical domain.

Association Rule Discovery

Suppose I = {i1,· · · , id} is a set of items (e.g. all the products in a store), andD = {d1,· · · ,dN} is a transactional database with N transactions (e.g. all theshopping lists in a year). The support of an itemset A ⊂ I is the frequency ofthe occurrence of A in all the transactions of D. The standard association rulemining provides a set of rules in form of A ⇒ B. In this rule, A is antecedent andB is the consequent, which are disjoint itemsets. Generally, a rule like A ⇒ B

means if the items of A occur in a transaction di, then the items of B alsowill plausibly appear in di. Typical measures to show the strength of a ruleare support (sup) and confidence (conf). Support of a rule shows how oftenthe rule itemsets occur in the database. Further, the confidence of rule A ⇒ B

determines how frequently the itemset B occurs in transactions which containitemset A. Let PD(A) be the probability of the occurrence of an itemset A in D.Then, support and confidence of the rule A ⇒ B are defined as [220]:

sup(A ⇒ B) = PD(A ∪ B), (8.1)

conf(A ⇒ B) = pD(A|B) = sup(A ⇒ B)/pD(A) (8.2)

The rules with sufficient support and confidence are typically known asstrong rules. The values of minsup and minconf are specified as the thresholdsfor strong and meaningful rules. Association rules with low support may have


8.1MiningTemporalRulesinPhysiologicalSensor

Data

Thissectionisdedicatedtopresenttemporalruleminingasadata-drivenanal-ysisoftheabstractedpatterns.Thegeneratedsetofrulescapturesthetemporalrelationshipsbetweenpatternsofphysiologicaldata.Thisruleminingapproachispresentedinthisthesistobeusedlaterinatemplate-basedlinguisticdescrip-tionapproach(explainedinSection8.2).Thissectionfirstpresentsanoverviewofassociationruleminingmethodsingeneralandminingtemporalrulesinthemedicaldomain.Afterwards,itgivesdetailsontheproposedapproachtoex-tracttemporalrulesfrommulti-channelsofphysiologicaltimeseriesdata.Theoutputrulesforclinicalconditionsarethencomparedusingarulesimilaritymeasurethathighlightstheuniquenessoftherulesineachcondition.

8.1.1BackgroundonTemporalRuleMining

Temporalruleminingisapromisingapproachtogeneratemeaningfulassocia-tionrulesfromsequentialdata[197].Thissectionfirstdescribesthestandardassociationruleminingmethod.Thenitreviewsruleminingapproachesfortemporaldatainthemedicaldomain.

AssociationRuleDiscovery

SupposeI={i1,···,id}isasetofitems(e.g.alltheproductsinastore),andD={d1,···,dN}isatransactionaldatabasewithNtransactions(e.g.alltheshoppinglistsinayear).ThesupportofanitemsetA⊂IisthefrequencyoftheoccurrenceofAinallthetransactionsofD.ThestandardassociationruleminingprovidesasetofrulesinformofA⇒B.Inthisrule,AisantecedentandBistheconsequent,whicharedisjointitemsets.Generally,arulelikeA⇒B

meansiftheitemsofAoccurinatransactiondi,thentheitemsofBalsowillplausiblyappearindi.Typicalmeasurestoshowthestrengthofarulearesupport(sup)andconfidence(conf).Supportofaruleshowshowoftentheruleitemsetsoccurinthedatabase.Further,theconfidenceofruleA⇒B

determineshowfrequentlytheitemsetBoccursintransactionswhichcontainitemsetA.LetPD(A)betheprobabilityoftheoccurrenceofanitemsetAinD.Then,supportandconfidenceoftheruleA⇒Baredefinedas[220]:

sup(A⇒B)=PD(A∪B),(8.1)

conf(A⇒B)=pD(A|B)=sup(A⇒B)/pD(A)(8.2)

Theruleswithsufficientsupportandconfidencearetypicallyknownasstrongrules.Thevaluesofminsupandminconfarespecifiedasthethresholdsforstrongandmeaningfulrules.Associationruleswithlowsupportmayhave

8.1. MINING TEMPORAL RULES 121

occurred accidentally that will not be interesting as significant rules. Similarly,a rule with low confidence cannot represent the frequent relations. Thus, thethresholds minsup and minconf given by the user can avoid involving the inef-fective rules in the result.

Temporal Relations in Association Rules

Several versions of association rule mining algorithms have been introducedto deal with non-transactional data which consist sequential items (i.e., timeseries) to give temporal rules [134]. These algorithms adapt the definition ofelements in association rules based on the time-stamped data to involve tem-poral constraints between the antecedent and the consequent of a rule. As in

the case of A T=⇒ B, which intends “If A happens, B will happen within time

T” [67]. Defining the temporal rules needs a reasonably good understanding oftime-dependent relations between the temporal observations (items) [193].

Based on the Allen’s temporal logic [18], 13 possible relationships betweeneach pair of temporal patterns can be specified. For association rule mining oftemporal data, the abstracted patterns from time series are defined as the items.Then the set of transactions in rule mining method is constructed by all thecombinations of Allen’s operations between temporal patterns. Then all thesecombinations of related temporal patterns from single or multivariate time se-ries build the set of transactions. For instance, suppose two time series t1 andt2, with the prototypical patterns Ct1 and Ct2 , also sequences of prototypicalpatterns P(t1) : p1· · ·pm and P(t2) : q1· · ·qm. To find the coincident rules be-tween t1 and t2, the set of items is I = Ct1 ∪ Ct2 , and the set of transactionsD is constructed with all pairs of di : (pi,qi) according to the ‘equal’ opera-tion (1 � i � m). The next step would be to apply the described associationrule mining algorithm to the provided transactions D and items I. The out-put of rule mining is a set of temporal rules R = {r1, r2,· · ·}, where each ruleri : A

ρ=⇒ B represents the repetitive relation of itemsets A and B along the

operation ρ, where A,B ⊂ I and ρ ∈ {‘equal’, ‘start’, ‘meet’, . . .}. Whilethese approaches have been applied to physiological data [62, 166], they lackcomparing the provided rule sets in various medical conditions. This compari-son can reveal valuable insights about the data. The approach presented in thischapter attempts to address this problem.

Temporal Rule Mining in Clinical Settings

Recently, temporal association rule mining methods have been applied on theclinical data stream to identify complex relationships of the physiological sen-sor observations. Sacchi et al. [192] presented a knowledge-based approach forrule mining from labelled temporal patterns in biomedical data. In [62], theauthors present temporal rule extraction for physiological data and address theproblem of visually analysing this kind of data. The study in [113] proposes a

8.1.MININGTEMPORALRULES121

occurredaccidentallythatwillnotbeinterestingassignificantrules.Similarly,arulewithlowconfidencecannotrepresentthefrequentrelations.Thus,thethresholdsminsupandminconfgivenbytheusercanavoidinvolvingtheinef-fectiverulesintheresult.

TemporalRelationsinAssociationRules

Severalversionsofassociationruleminingalgorithmshavebeenintroducedtodealwithnon-transactionaldatawhichconsistsequentialitems(i.e.,timeseries)togivetemporalrules[134].Thesealgorithmsadaptthedefinitionofelementsinassociationrulesbasedonthetime-stampeddatatoinvolvetem-poralconstraintsbetweentheantecedentandtheconsequentofarule.Asin

thecaseofAT=⇒B,whichintends“IfAhappens,Bwillhappenwithintime

T”[67].Definingthetemporalrulesneedsareasonablygoodunderstandingoftime-dependentrelationsbetweenthetemporalobservations(items)[193].

BasedontheAllen’stemporallogic[18],13possiblerelationshipsbetweeneachpairoftemporalpatternscanbespecified.Forassociationruleminingoftemporaldata,theabstractedpatternsfromtimeseriesaredefinedastheitems.ThenthesetoftransactionsinruleminingmethodisconstructedbyallthecombinationsofAllen’soperationsbetweentemporalpatterns.Thenallthesecombinationsofrelatedtemporalpatternsfromsingleormultivariatetimese-riesbuildthesetoftransactions.Forinstance,supposetwotimeseriest1andt2,withtheprototypicalpatternsCt1andCt2,alsosequencesofprototypicalpatternsP(t1):p1···pmandP(t2):q1···qm.Tofindthecoincidentrulesbe-tweent1andt2,thesetofitemsisI=Ct1∪Ct2,andthesetoftransactionsDisconstructedwithallpairsofdi:(pi,qi)accordingtothe‘equal’opera-tion(1�i�m).ThenextstepwouldbetoapplythedescribedassociationruleminingalgorithmtotheprovidedtransactionsDanditemsI.Theout-putofruleminingisasetoftemporalrulesR={r1,r2,···},whereeachruleri:A

ρ=⇒BrepresentstherepetitiverelationofitemsetsAandBalongthe

operationρ,whereA,B⊂Iandρ∈{‘equal’,‘start’,‘meet’,...}.Whiletheseapproacheshavebeenappliedtophysiologicaldata[62,166],theylackcomparingtheprovidedrulesetsinvariousmedicalconditions.Thiscompari-soncanrevealvaluableinsightsaboutthedata.Theapproachpresentedinthischapterattemptstoaddressthisproblem.

TemporalRuleMininginClinicalSettings

Recently,temporalassociationruleminingmethodshavebeenappliedontheclinicaldatastreamtoidentifycomplexrelationshipsofthephysiologicalsen-sorobservations.Sacchietal.[192]presentedaknowledge-basedapproachforruleminingfromlabelledtemporalpatternsinbiomedicaldata.In[62],theauthorspresenttemporalruleextractionforphysiologicaldataandaddresstheproblemofvisuallyanalysingthiskindofdata.Thestudyin[113]proposesa


occurred accidentally that will not be interesting as significant rules. Similarly,a rule with low confidence cannot represent the frequent relations. Thus, thethresholds minsup and minconf given by the user can avoid involving the inef-fective rules in the result.

Temporal Relations in Association Rules

Several versions of association rule mining algorithms have been introducedto deal with non-transactional data which consist sequential items (i.e., timeseries) to give temporal rules [134]. These algorithms adapt the definition ofelements in association rules based on the time-stamped data to involve tem-poral constraints between the antecedent and the consequent of a rule. As in

the case of A T=⇒ B, which intends “If A happens, B will happen within time

T” [67]. Defining the temporal rules needs a reasonably good understanding oftime-dependent relations between the temporal observations (items) [193].

Based on the Allen’s temporal logic [18], 13 possible relationships betweeneach pair of temporal patterns can be specified. For association rule mining oftemporal data, the abstracted patterns from time series are defined as the items.Then the set of transactions in rule mining method is constructed by all thecombinations of Allen’s operations between temporal patterns. Then all thesecombinations of related temporal patterns from single or multivariate time se-ries build the set of transactions. For instance, suppose two time series t1 andt2, with the prototypical patterns Ct1 and Ct2 , also sequences of prototypicalpatterns P(t1) : p1· · ·pm and P(t2) : q1· · ·qm. To find the coincident rules be-tween t1 and t2, the set of items is I = Ct1 ∪ Ct2 , and the set of transactionsD is constructed with all pairs of di : (pi,qi) according to the ‘equal’ opera-tion (1 � i � m). The next step would be to apply the described associationrule mining algorithm to the provided transactions D and items I. The out-put of rule mining is a set of temporal rules R = {r1, r2,· · ·}, where each ruleri : A

ρ=⇒ B represents the repetitive relation of itemsets A and B along the

operation ρ, where A,B ⊂ I and ρ ∈ {‘equal’, ‘start’, ‘meet’, . . .}. Whilethese approaches have been applied to physiological data [62, 166], they lackcomparing the provided rule sets in various medical conditions. This compari-son can reveal valuable insights about the data. The approach presented in thischapter attempts to address this problem.

Temporal Rule Mining in Clinical Settings

Recently, temporal association rule mining methods have been applied on theclinical data stream to identify complex relationships of the physiological sen-sor observations. Sacchi et al. [192] presented a knowledge-based approach forrule mining from labelled temporal patterns in biomedical data. In [62], theauthors present temporal rule extraction for physiological data and address theproblem of visually analysing this kind of data. The study in [113] proposes a


occurredaccidentallythatwillnotbeinterestingassignificantrules.Similarly,arulewithlowconfidencecannotrepresentthefrequentrelations.Thus,thethresholdsminsupandminconfgivenbytheusercanavoidinvolvingtheinef-fectiverulesintheresult.

TemporalRelationsinAssociationRules

Severalversionsofassociationruleminingalgorithmshavebeenintroducedtodealwithnon-transactionaldatawhichconsistsequentialitems(i.e.,timeseries)togivetemporalrules[134].Thesealgorithmsadaptthedefinitionofelementsinassociationrulesbasedonthetime-stampeddatatoinvolvetem-poralconstraintsbetweentheantecedentandtheconsequentofarule.Asin

thecaseofAT=⇒B,whichintends“IfAhappens,Bwillhappenwithintime

T”[67].Definingthetemporalrulesneedsareasonablygoodunderstandingoftime-dependentrelationsbetweenthetemporalobservations(items)[193].

BasedontheAllen’stemporallogic[18],13possiblerelationshipsbetweeneachpairoftemporalpatternscanbespecified.Forassociationruleminingoftemporaldata,theabstractedpatternsfromtimeseriesaredefinedastheitems.ThenthesetoftransactionsinruleminingmethodisconstructedbyallthecombinationsofAllen’soperationsbetweentemporalpatterns.Thenallthesecombinationsofrelatedtemporalpatternsfromsingleormultivariatetimese-riesbuildthesetoftransactions.Forinstance,supposetwotimeseriest1andt2,withtheprototypicalpatternsCt1andCt2,alsosequencesofprototypicalpatternsP(t1):p1···pmandP(t2):q1···qm.Tofindthecoincidentrulesbe-tweent1andt2,thesetofitemsisI=Ct1∪Ct2,andthesetoftransactionsDisconstructedwithallpairsofdi:(pi,qi)accordingtothe‘equal’opera-tion(1�i�m).ThenextstepwouldbetoapplythedescribedassociationruleminingalgorithmtotheprovidedtransactionsDanditemsI.Theout-putofruleminingisasetoftemporalrulesR={r1,r2,···},whereeachruleri:A

ρ=⇒BrepresentstherepetitiverelationofitemsetsAandBalongthe

operationρ,whereA,B⊂Iandρ∈{‘equal’,‘start’,‘meet’,...}.Whiletheseapproacheshavebeenappliedtophysiologicaldata[62,166],theylackcomparingtheprovidedrulesetsinvariousmedicalconditions.Thiscompari-soncanrevealvaluableinsightsaboutthedata.Theapproachpresentedinthischapterattemptstoaddressthisproblem.

TemporalRuleMininginClinicalSettings

Recently,temporalassociationruleminingmethodshavebeenappliedontheclinicaldatastreamtoidentifycomplexrelationshipsofthephysiologicalsen-sorobservations.Sacchietal.[192]presentedaknowledge-basedapproachforruleminingfromlabelledtemporalpatternsinbiomedicaldata.In[62],theauthorspresenttemporalruleextractionforphysiologicaldataandaddresstheproblemofvisuallyanalysingthiskindofdata.Thestudyin[113]proposesa


novel multivariate association rule mining based on change detection for com-plex data sets including numerical data streams. The authors in [157] introducean approach to generate the rules automatically from the linguistic data of coro-nary heart disease using subtractive clustering and fuzzy inference to determinethe diagnosis. In [20], a temporal technique for discovering frequent temporalpatterns is proposed to extract well-known patterns of sleep apnea-hypopneasyndrome.

Although these systems have used rule mining techniques for health moni-toring, none of them has focused on modelling the individual behaviours, alongwith a descriptive approach to represent the output of the system (i.e. generatedrules). Also, those are mostly dependent on the initial knowledge provided bythe user.

8.1.2 A New Approach for Temporal Rule Mining

To apply an association rule mining approach on each clinical condition fromthe MIMIC data set, all the selected records of the subjects with the same con-dition are accumulated to be analysed together. In this way, a more significantamount of data is involved in the process of rule mining, which leads to havingmore robust rules for each clinical condition. Recall from Chapter 7, the con-sidered measurements of physiological data for each condition includes threevariables HR, BP, and RR. Suppose three time series tHR, tBP, and tRR, withthe same length of n corresponded to the measurements in HR, BP, and RR,respectively.

The sequences of patterns P(tHR), P(tBP), and P(tRR), with size of m, areobtained from the prototypical pattern abstraction, explained in Section 7.3.Association rule mining is a suitable approach to discover the coherence rela-tions between the patterns occurred among the multi-variables. In this work,the focus is on the association rules between two pairs of physiological timeseries, i.e. heart rate with blood pressure (HR&BP), and heart rate with respi-ration rate (HR&RR), although, more compound relations are also applicableby applying complex temporal abstraction techniques [201]. As discussed inSection 8.1, the main requirement for association rule mining is to identify theset of items I and the set of transactions D. While considering the relation ofHR and BP patterns, the set of items I includes all the prototypical patterns inboth CtHR

and CtBP.

Different temporal relations can be defined on the discovered patterns tospecify the transactions, but in this study, a modified set of relations betweenthe patterns in physiological data is specified. Consider two multivariate signalsHR and BP with the sequences of patterns P(tHR) and P(tBP), respectively. LetP1 represents one pattern in P(tHR) and P2 represents at most two patterns inP(tBP). Three temporal relations between P1 and P2 are considered: ‘P1 equalsP2’, ‘P1 before P2’, and ‘P1 after P2’. It is worth noting that further temporalrelations such as meets and overlaps are only slightly different with before and


novelmultivariateassociationruleminingbasedonchangedetectionforcom-plexdatasetsincludingnumericaldatastreams.Theauthorsin[157]introduceanapproachtogeneratetherulesautomaticallyfromthelinguisticdataofcoro-naryheartdiseaseusingsubtractiveclusteringandfuzzyinferencetodeterminethediagnosis.In[20],atemporaltechniquefordiscoveringfrequenttemporalpatternsisproposedtoextractwell-knownpatternsofsleepapnea-hypopneasyndrome.

Althoughthesesystemshaveusedruleminingtechniquesforhealthmoni-toring,noneofthemhasfocusedonmodellingtheindividualbehaviours,alongwithadescriptiveapproachtorepresenttheoutputofthesystem(i.e.generatedrules).Also,thosearemostlydependentontheinitialknowledgeprovidedbytheuser.

8.1.2ANewApproachforTemporalRuleMining

ToapplyanassociationruleminingapproachoneachclinicalconditionfromtheMIMICdataset,alltheselectedrecordsofthesubjectswiththesamecon-ditionareaccumulatedtobeanalysedtogether.Inthisway,amoresignificantamountofdataisinvolvedintheprocessofrulemining,whichleadstohavingmorerobustrulesforeachclinicalcondition.RecallfromChapter7,thecon-sideredmeasurementsofphysiologicaldataforeachconditionincludesthreevariablesHR,BP,andRR.SupposethreetimeseriestHR,tBP,andtRR,withthesamelengthofncorrespondedtothemeasurementsinHR,BP,andRR,respectively.

ThesequencesofpatternsP(tHR),P(tBP),andP(tRR),withsizeofm,areobtainedfromtheprototypicalpatternabstraction,explainedinSection7.3.Associationruleminingisasuitableapproachtodiscoverthecoherencerela-tionsbetweenthepatternsoccurredamongthemulti-variables.Inthiswork,thefocusisontheassociationrulesbetweentwopairsofphysiologicaltimeseries,i.e.heartratewithbloodpressure(HR&BP),andheartratewithrespi-rationrate(HR&RR),although,morecompoundrelationsarealsoapplicablebyapplyingcomplextemporalabstractiontechniques[201].AsdiscussedinSection8.1,themainrequirementforassociationruleminingistoidentifythesetofitemsIandthesetoftransactionsD.WhileconsideringtherelationofHRandBPpatterns,thesetofitemsIincludesalltheprototypicalpatternsinbothCtHRandCtBP.

Differenttemporalrelationscanbedefinedonthediscoveredpatternstospecifythetransactions,butinthisstudy,amodifiedsetofrelationsbetweenthepatternsinphysiologicaldataisspecified.ConsidertwomultivariatesignalsHRandBPwiththesequencesofpatternsP(tHR)andP(tBP),respectively.LetP1representsonepatterninP(tHR)andP2representsatmosttwopatternsinP(tBP).ThreetemporalrelationsbetweenP1andP2areconsidered:‘P1equalsP2’,‘P1beforeP2’,and‘P1afterP2’.Itisworthnotingthatfurthertemporalrelationssuchasmeetsandoverlapsareonlyslightlydifferentwithbeforeand


novel multivariate association rule mining based on change detection for com-plex data sets including numerical data streams. The authors in [157] introducean approach to generate the rules automatically from the linguistic data of coro-nary heart disease using subtractive clustering and fuzzy inference to determinethe diagnosis. In [20], a temporal technique for discovering frequent temporalpatterns is proposed to extract well-known patterns of sleep apnea-hypopneasyndrome.

Although these systems have used rule mining techniques for health moni-toring, none of them has focused on modelling the individual behaviours, alongwith a descriptive approach to represent the output of the system (i.e. generatedrules). Also, those are mostly dependent on the initial knowledge provided bythe user.

8.1.2 A New Approach for Temporal Rule Mining

To apply an association rule mining approach on each clinical condition fromthe MIMIC data set, all the selected records of the subjects with the same con-dition are accumulated to be analysed together. In this way, a more significantamount of data is involved in the process of rule mining, which leads to havingmore robust rules for each clinical condition. Recall from Chapter 7, the con-sidered measurements of physiological data for each condition includes threevariables HR, BP, and RR. Suppose three time series tHR, tBP, and tRR, withthe same length of n corresponded to the measurements in HR, BP, and RR,respectively.

The sequences of patterns P(tHR), P(tBP), and P(tRR), with size of m, areobtained from the prototypical pattern abstraction, explained in Section 7.3.Association rule mining is a suitable approach to discover the coherence rela-tions between the patterns occurred among the multi-variables. In this work,the focus is on the association rules between two pairs of physiological timeseries, i.e. heart rate with blood pressure (HR&BP), and heart rate with respi-ration rate (HR&RR), although, more compound relations are also applicableby applying complex temporal abstraction techniques [201]. As discussed inSection 8.1, the main requirement for association rule mining is to identify theset of items I and the set of transactions D. While considering the relation ofHR and BP patterns, the set of items I includes all the prototypical patterns inboth CtHR

and CtBP.

Different temporal relations can be defined on the discovered patterns tospecify the transactions, but in this study, a modified set of relations betweenthe patterns in physiological data is specified. Consider two multivariate signalsHR and BP with the sequences of patterns P(tHR) and P(tBP), respectively. LetP1 represents one pattern in P(tHR) and P2 represents at most two patterns inP(tBP). Three temporal relations between P1 and P2 are considered: ‘P1 equalsP2’, ‘P1 before P2’, and ‘P1 after P2’. It is worth noting that further temporalrelations such as meets and overlaps are only slightly different with before and


novelmultivariateassociationruleminingbasedonchangedetectionforcom-plexdatasetsincludingnumericaldatastreams.Theauthorsin[157]introduceanapproachtogeneratetherulesautomaticallyfromthelinguisticdataofcoro-naryheartdiseaseusingsubtractiveclusteringandfuzzyinferencetodeterminethediagnosis.In[20],atemporaltechniquefordiscoveringfrequenttemporalpatternsisproposedtoextractwell-knownpatternsofsleepapnea-hypopneasyndrome.

Althoughthesesystemshaveusedruleminingtechniquesforhealthmoni-toring,noneofthemhasfocusedonmodellingtheindividualbehaviours,alongwithadescriptiveapproachtorepresenttheoutputofthesystem(i.e.generatedrules).Also,thosearemostlydependentontheinitialknowledgeprovidedbytheuser.

8.1.2ANewApproachforTemporalRuleMining

ToapplyanassociationruleminingapproachoneachclinicalconditionfromtheMIMICdataset,alltheselectedrecordsofthesubjectswiththesamecon-ditionareaccumulatedtobeanalysedtogether.Inthisway,amoresignificantamountofdataisinvolvedintheprocessofrulemining,whichleadstohavingmorerobustrulesforeachclinicalcondition.RecallfromChapter7,thecon-sideredmeasurementsofphysiologicaldataforeachconditionincludesthreevariablesHR,BP,andRR.SupposethreetimeseriestHR,tBP,andtRR,withthesamelengthofncorrespondedtothemeasurementsinHR,BP,andRR,respectively.

ThesequencesofpatternsP(tHR),P(tBP),andP(tRR),withsizeofm,areobtainedfromtheprototypicalpatternabstraction,explainedinSection7.3.Associationruleminingisasuitableapproachtodiscoverthecoherencerela-tionsbetweenthepatternsoccurredamongthemulti-variables.Inthiswork,thefocusisontheassociationrulesbetweentwopairsofphysiologicaltimeseries,i.e.heartratewithbloodpressure(HR&BP),andheartratewithrespi-rationrate(HR&RR),although,morecompoundrelationsarealsoapplicablebyapplyingcomplextemporalabstractiontechniques[201].AsdiscussedinSection8.1,themainrequirementforassociationruleminingistoidentifythesetofitemsIandthesetoftransactionsD.WhileconsideringtherelationofHRandBPpatterns,thesetofitemsIincludesalltheprototypicalpatternsinbothCtHRandCtBP.

Differenttemporalrelationscanbedefinedonthediscoveredpatternstospecifythetransactions,butinthisstudy,amodifiedsetofrelationsbetweenthepatternsinphysiologicaldataisspecified.ConsidertwomultivariatesignalsHRandBPwiththesequencesofpatternsP(tHR)andP(tBP),respectively.LetP1representsonepatterninP(tHR)andP2representsatmosttwopatternsinP(tBP).ThreetemporalrelationsbetweenP1andP2areconsidered:‘P1equalsP2’,‘P1beforeP2’,and‘P1afterP2’.Itisworthnotingthatfurthertemporalrelationssuchasmeetsandoverlapsareonlyslightlydifferentwithbeforeand


Figure 8.1: Three temporal relations between one pattern P1 in P(tHR) and atmost two patterns P2 in P(tBP).

after relations. Likewise, the relations during, starts, and finishes are mostlycovered by equals relation. Thus, for all the patterns in the sequences P(tHR)and P(tBP), three corresponding transactions for ‘equals’, ‘before’, and ‘after’relations are defined as follows:

pi equals qi : dei = (pi, qi), (8.3)

pi before qi+1, qi+2 : dbi = (pi, qi+1), (pi, qi+2), (8.4)

pi after qi−1, qi−2 : dai = (pi, qi−1), (pi, qi−2), (8.5)

where pi ∈ P(tHR) and qi ∈ P(tBP). Note that the relation ‘P1 before P2’is equivalent to the relation ‘P2 after P1’, which means the opposite relationsbetween BP and HR are also covered by this definition. Figure 8.1 shows therelational positions of patterns pi, and qi−2· · ·qi+2 in their corresponding pat-tern sequences for three defined temporal relations.

The Apriori algorithm introduced in [11] is an efficient algorithm for as-sociation rule mining from a set of transactions D, which initialises all possi-ble itemsets from the items I and then generates a set of sufficient rules likeA ⇒ B based on the co-occurrence of A and B in the transactions. This algo-rithm is based on the symbolic order of items, which can destroy the tem-poral relations in sequential data. However, in this approach, the temporalrelations of the patterns are implicitly specified in the definition of the intro-duced transactions. In other words, the Apriori algorithm is applied, but withan adapted set of transactions Dρ including defined temporal relations as theiritems: Dρ = {de

i ,dbi ,da

i | 1 � i � |P(tHR)|}. Using the defined set of temporaltransactions and the set of items (prototypical patterns), the generated rule isformulated as: r : A

ρ=⇒ B, where the antecedent (A) and consequent (B) can

be any of two subsequences P1 and P2 that are co-occurred in Dρ. The rule r

also expresses additional information about the temporal relation ρ between p

and q, such that ρ ∈ {‘equal’, ‘before’, ‘after’}. Applying the Apriori algorithm


Figure8.1:ThreetemporalrelationsbetweenonepatternP1inP(tHR)andatmosttwopatternsP2inP(tBP).

afterrelations.Likewise,therelationsduring,starts,andfinishesaremostlycoveredbyequalsrelation.Thus,forallthepatternsinthesequencesP(tHR)andP(tBP),threecorrespondingtransactionsfor‘equals’,‘before’,and‘after’relationsaredefinedasfollows:

piequalsqi:dei=(pi,qi),(8.3)

pibeforeqi+1,qi+2:dbi=(pi,qi+1),(pi,qi+2),(8.4)

piafterqi−1,qi−2:dai=(pi,qi−1),(pi,qi−2),(8.5)

wherepi∈P(tHR)andqi∈P(tBP).Notethattherelation‘P1beforeP2’isequivalenttotherelation‘P2afterP1’,whichmeanstheoppositerelationsbetweenBPandHRarealsocoveredbythisdefinition.Figure8.1showstherelationalpositionsofpatternspi,andqi−2···qi+2intheircorrespondingpat-ternsequencesforthreedefinedtemporalrelations.

TheApriorialgorithmintroducedin[11]isanefficientalgorithmforas-sociationruleminingfromasetoftransactionsD,whichinitialisesallpossi-bleitemsetsfromtheitemsIandthengeneratesasetofsufficientruleslikeA⇒Bbasedontheco-occurrenceofAandBinthetransactions.Thisalgo-rithmisbasedonthesymbolicorderofitems,whichcandestroythetem-poralrelationsinsequentialdata.However,inthisapproach,thetemporalrelationsofthepatternsareimplicitlyspecifiedinthedefinitionoftheintro-ducedtransactions.Inotherwords,theApriorialgorithmisapplied,butwithanadaptedsetoftransactionsD

ρincludingdefinedtemporalrelationsastheir

items:Dρ={d

ei,d

bi,d

ai|1�i�|P(tHR)|}.Usingthedefinedsetoftemporal

transactionsandthesetofitems(prototypicalpatterns),thegeneratedruleisformulatedas:r:A

ρ=⇒B,wheretheantecedent(A)andconsequent(B)can

beanyoftwosubsequencesP1andP2thatareco-occurredinDρ.Theruler

alsoexpressesadditionalinformationaboutthetemporalrelationρbetweenp

andq,suchthatρ∈{‘equal’,‘before’,‘after’}.ApplyingtheApriorialgorithm


Figure 8.1: Three temporal relations between one pattern P1 in P(tHR) and atmost two patterns P2 in P(tBP).

after relations. Likewise, the relations during, starts, and finishes are mostlycovered by equals relation. Thus, for all the patterns in the sequences P(tHR)and P(tBP), three corresponding transactions for ‘equals’, ‘before’, and ‘after’relations are defined as follows:

pi equals qi : dei = (pi, qi), (8.3)

pi before qi+1, qi+2 : dbi = (pi, qi+1), (pi, qi+2), (8.4)

pi after qi−1, qi−2 : dai = (pi, qi−1), (pi, qi−2), (8.5)

where pi ∈ P(tHR) and qi ∈ P(tBP). Note that the relation ‘P1 before P2’is equivalent to the relation ‘P2 after P1’, which means the opposite relationsbetween BP and HR are also covered by this definition. Figure 8.1 shows therelational positions of patterns pi, and qi−2· · ·qi+2 in their corresponding pat-tern sequences for three defined temporal relations.

The Apriori algorithm introduced in [11] is an efficient algorithm for as-sociation rule mining from a set of transactions D, which initialises all possi-ble itemsets from the items I and then generates a set of sufficient rules likeA ⇒ B based on the co-occurrence of A and B in the transactions. This algo-rithm is based on the symbolic order of items, which can destroy the tem-poral relations in sequential data. However, in this approach, the temporalrelations of the patterns are implicitly specified in the definition of the intro-duced transactions. In other words, the Apriori algorithm is applied, but withan adapted set of transactions Dρ including defined temporal relations as theiritems: Dρ = {de

i ,dbi ,da

i | 1 � i � |P(tHR)|}. Using the defined set of temporaltransactions and the set of items (prototypical patterns), the generated rule isformulated as: r : A

ρ=⇒ B, where the antecedent (A) and consequent (B) can

be any of two subsequences P1 and P2 that are co-occurred in Dρ. The rule r

also expresses additional information about the temporal relation ρ between p

and q, such that ρ ∈ {‘equal’, ‘before’, ‘after’}. Applying the Apriori algorithm


Figure8.1:ThreetemporalrelationsbetweenonepatternP1inP(tHR)andatmosttwopatternsP2inP(tBP).

afterrelations.Likewise,therelationsduring,starts,andfinishesaremostlycoveredbyequalsrelation.Thus,forallthepatternsinthesequencesP(tHR)andP(tBP),threecorrespondingtransactionsfor‘equals’,‘before’,and‘after’relationsaredefinedasfollows:

piequalsqi:dei=(pi,qi),(8.3)

pibeforeqi+1,qi+2:dbi=(pi,qi+1),(pi,qi+2),(8.4)

piafterqi−1,qi−2:dai=(pi,qi−1),(pi,qi−2),(8.5)

wherepi∈P(tHR)andqi∈P(tBP).Notethattherelation‘P1beforeP2’isequivalenttotherelation‘P2afterP1’,whichmeanstheoppositerelationsbetweenBPandHRarealsocoveredbythisdefinition.Figure8.1showstherelationalpositionsofpatternspi,andqi−2···qi+2intheircorrespondingpat-ternsequencesforthreedefinedtemporalrelations.

TheApriorialgorithmintroducedin[11]isanefficientalgorithmforas-sociationruleminingfromasetoftransactionsD,whichinitialisesallpossi-bleitemsetsfromtheitemsIandthengeneratesasetofsufficientruleslikeA⇒Bbasedontheco-occurrenceofAandBinthetransactions.Thisalgo-rithmisbasedonthesymbolicorderofitems,whichcandestroythetem-poralrelationsinsequentialdata.However,inthisapproach,thetemporalrelationsofthepatternsareimplicitlyspecifiedinthedefinitionoftheintro-ducedtransactions.Inotherwords,theApriorialgorithmisapplied,butwithanadaptedsetoftransactionsD

ρincludingdefinedtemporalrelationsastheir

items:Dρ={d

ei,d

bi,d

ai|1�i�|P(tHR)|}.Usingthedefinedsetoftemporal

transactionsandthesetofitems(prototypicalpatterns),thegeneratedruleisformulatedas:r:A

ρ=⇒B,wheretheantecedent(A)andconsequent(B)can

beanyoftwosubsequencesP1andP2thatareco-occurredinDρ.Theruler

alsoexpressesadditionalinformationaboutthetemporalrelationρbetweenp

andq,suchthatρ∈{‘equal’,‘before’,‘after’}.ApplyingtheApriorialgorithm


with accurate values for minsup and minconf leads to have a set of temporalrules R = {r1, r2,· · · , rn} as a result. This rule set consists of the main repetitiverelations of physiological data in sensor observations.

8.1.3 Temporal Rule Set Similarity

In this work, a similarity function is proposed to compute a ratio between thenumber of rules from one rule set which occur in another rule set. The providedtemporal rules can be extended to represent the individual behaviour of vitalsigns in a given condition. This similarity function can evaluate the distinctionbetween temporal rule sets. Suppose there are two rule sets R1 = {r1,· · · , rn1 }

and R2 = {r1,· · · , rn2 } including m and n rules, respectively. The overlappingratio of rule sets is a primary measure to investigate the characteristic propertiesof rule sets with the same sets of items [76]. The overlapping ratio as a similarityfunction between a pair of rule sets is typically defined as:

Overlap(R1,R2) = |R1 ∩ R2| / |R1 ∪ R2|. (8.6)

In a standard rule association mining with a constant database of items,counting the intersection of the rules in R1 and R2 is straightforward, since itis easy to check the equivalence of rules. Two rules ri : A ⇒ B and rj:C⇒D

are equivalent if their corresponding itemsets are equal: A = C and B = D.However, the main issue with temporal rule sets produced by this approach isthat the sets of items in different rule sets are entirely distinct. In other words,for different cases, there are different sets of prototypical patterns (items), andconsequently different itemsets in the final rules. Thus, finding the overlap oftemporal rule sets using Equation 8.6 utilising this function is not informative.Here, an alternative solution is proposed.

Occurrence Ratio

Suppose IR1 and IR2 are the most likely distinct sets of items (patterns) forthe temporal rule sets R1 and R2, respectively. To find the equivalent rule tori : A

ρ=⇒ B ∈ R1 in rule set R2 (if exists), the approach searches for the most

analogous rule r ′j : A ′ ρ′=⇒ B ′ ∈ R2 which is sufficiently similar to ri. Here,

the similarity of itemsets is measured through the use of the pattern matchingalgorithms to find the best-matched patterns [59]. If r ′j exists, then one overlapis found between R1 and R2, which means A ≈ A ′, B ≈ B ′, and ρ = ρ ′.It is notable that the corresponding itemsets need to be approximately equal,whereas, the temporal relations have to be the same.

Searching for the occurrence of one rule in a rule set is presented in Algo-rithm 8.1. This algorithm shows how to find the most analogous rule from R

to an input rule r (with the assumption of r/∈R). If such a matched rule rm canbe detected in R, it derives that the rule r most likely appears in R as well.


withaccuratevaluesforminsupandminconfleadstohaveasetoftemporalrulesR={r1,r2,···,rn}asaresult.Thisrulesetconsistsofthemainrepetitiverelationsofphysiologicaldatainsensorobservations.

8.1.3TemporalRuleSetSimilarity

Inthiswork,asimilarityfunctionisproposedtocomputearatiobetweenthenumberofrulesfromonerulesetwhichoccurinanotherruleset.Theprovidedtemporalrulescanbeextendedtorepresenttheindividualbehaviourofvitalsignsinagivencondition.Thissimilarityfunctioncanevaluatethedistinctionbetweentemporalrulesets.SupposetherearetworulesetsR1={r1,···,rn1}

andR2={r1,···,rn2}includingmandnrules,respectively.Theoverlappingratioofrulesetsisaprimarymeasuretoinvestigatethecharacteristicpropertiesofrulesetswiththesamesetsofitems[76].Theoverlappingratioasasimilarityfunctionbetweenapairofrulesetsistypicallydefinedas:

Overlap(R1,R2)=|R1∩R2|/|R1∪R2|.(8.6)

Inastandardruleassociationminingwithaconstantdatabaseofitems,countingtheintersectionoftherulesinR1andR2isstraightforward,sinceitiseasytochecktheequivalenceofrules.Tworulesri:A⇒Bandrj:C⇒D

areequivalentiftheircorrespondingitemsetsareequal:A=CandB=D.However,themainissuewithtemporalrulesetsproducedbythisapproachisthatthesetsofitemsindifferentrulesetsareentirelydistinct.Inotherwords,fordifferentcases,therearedifferentsetsofprototypicalpatterns(items),andconsequentlydifferentitemsetsinthefinalrules.Thus,findingtheoverlapoftemporalrulesetsusingEquation8.6utilisingthisfunctionisnotinformative.Here,analternativesolutionisproposed.

OccurrenceRatio

SupposeIR1andIR2arethemostlikelydistinctsetsofitems(patterns)forthetemporalrulesetsR1andR2,respectively.Tofindtheequivalentruletori:A

ρ=⇒B∈R1inrulesetR2(ifexists),theapproachsearchesforthemost

analogousruler′j:A′ρ′

=⇒B′∈R2whichissufficientlysimilartori.Here,thesimilarityofitemsetsismeasuredthroughtheuseofthepatternmatchingalgorithmstofindthebest-matchedpatterns[59].Ifr′

jexists,thenoneoverlapisfoundbetweenR1andR2,whichmeansA≈A′,B≈B′,andρ=ρ′.Itisnotablethatthecorrespondingitemsetsneedtobeapproximatelyequal,whereas,thetemporalrelationshavetobethesame.

SearchingfortheoccurrenceofoneruleinarulesetispresentedinAlgo-rithm8.1.ThisalgorithmshowshowtofindthemostanalogousrulefromR

toaninputruler(withtheassumptionofr/∈R).IfsuchamatchedrulermcanbedetectedinR,itderivesthattherulermostlikelyappearsinRaswell.


with accurate values for minsup and minconf leads to have a set of temporalrules R = {r1, r2,· · · , rn} as a result. This rule set consists of the main repetitiverelations of physiological data in sensor observations.

8.1.3 Temporal Rule Set Similarity

In this work, a similarity function is proposed to compute a ratio between thenumber of rules from one rule set which occur in another rule set. The providedtemporal rules can be extended to represent the individual behaviour of vitalsigns in a given condition. This similarity function can evaluate the distinctionbetween temporal rule sets. Suppose there are two rule sets R1 = {r1,· · · , rn1 }

and R2 = {r1,· · · , rn2 } including m and n rules, respectively. The overlappingratio of rule sets is a primary measure to investigate the characteristic propertiesof rule sets with the same sets of items [76]. The overlapping ratio as a similarityfunction between a pair of rule sets is typically defined as:

Overlap(R1,R2) = |R1 ∩ R2| / |R1 ∪ R2|. (8.6)

In a standard rule association mining with a constant database of items,counting the intersection of the rules in R1 and R2 is straightforward, since itis easy to check the equivalence of rules. Two rules ri : A ⇒ B and rj:C⇒D

are equivalent if their corresponding itemsets are equal: A = C and B = D.However, the main issue with temporal rule sets produced by this approach isthat the sets of items in different rule sets are entirely distinct. In other words,for different cases, there are different sets of prototypical patterns (items), andconsequently different itemsets in the final rules. Thus, finding the overlap oftemporal rule sets using Equation 8.6 utilising this function is not informative.Here, an alternative solution is proposed.

Occurrence Ratio

Suppose IR1 and IR2 are the most likely distinct sets of items (patterns) forthe temporal rule sets R1 and R2, respectively. To find the equivalent rule tori : A

ρ=⇒ B ∈ R1 in rule set R2 (if exists), the approach searches for the most

analogous rule r ′j : A ′ ρ′=⇒ B ′ ∈ R2 which is sufficiently similar to ri. Here,

the similarity of itemsets is measured through the use of the pattern matchingalgorithms to find the best-matched patterns [59]. If r ′j exists, then one overlapis found between R1 and R2, which means A ≈ A ′, B ≈ B ′, and ρ = ρ ′.It is notable that the corresponding itemsets need to be approximately equal,whereas, the temporal relations have to be the same.

Searching for the occurrence of one rule in a rule set is presented in Algo-rithm 8.1. This algorithm shows how to find the most analogous rule from R

to an input rule r (with the assumption of r/∈R). If such a matched rule rm canbe detected in R, it derives that the rule r most likely appears in R as well.


withaccuratevaluesforminsupandminconfleadstohaveasetoftemporalrulesR={r1,r2,···,rn}asaresult.Thisrulesetconsistsofthemainrepetitiverelationsofphysiologicaldatainsensorobservations.

8.1.3TemporalRuleSetSimilarity

Inthiswork,asimilarityfunctionisproposedtocomputearatiobetweenthenumberofrulesfromonerulesetwhichoccurinanotherruleset.Theprovidedtemporalrulescanbeextendedtorepresenttheindividualbehaviourofvitalsignsinagivencondition.Thissimilarityfunctioncanevaluatethedistinctionbetweentemporalrulesets.SupposetherearetworulesetsR1={r1,···,rn1}

andR2={r1,···,rn2}includingmandnrules,respectively.Theoverlappingratioofrulesetsisaprimarymeasuretoinvestigatethecharacteristicpropertiesofrulesetswiththesamesetsofitems[76].Theoverlappingratioasasimilarityfunctionbetweenapairofrulesetsistypicallydefinedas:

Overlap(R1,R2)=|R1∩R2|/|R1∪R2|.(8.6)

Inastandardruleassociationminingwithaconstantdatabaseofitems,countingtheintersectionoftherulesinR1andR2isstraightforward,sinceitiseasytochecktheequivalenceofrules.Tworulesri:A⇒Bandrj:C⇒D

areequivalentiftheircorrespondingitemsetsareequal:A=CandB=D.However,themainissuewithtemporalrulesetsproducedbythisapproachisthatthesetsofitemsindifferentrulesetsareentirelydistinct.Inotherwords,fordifferentcases,therearedifferentsetsofprototypicalpatterns(items),andconsequentlydifferentitemsetsinthefinalrules.Thus,findingtheoverlapoftemporalrulesetsusingEquation8.6utilisingthisfunctionisnotinformative.Here,analternativesolutionisproposed.

OccurrenceRatio

SupposeIR1andIR2arethemostlikelydistinctsetsofitems(patterns)forthetemporalrulesetsR1andR2,respectively.Tofindtheequivalentruletori:A

ρ=⇒B∈R1inrulesetR2(ifexists),theapproachsearchesforthemost

analogousruler′j:A′ρ′

=⇒B′∈R2whichissufficientlysimilartori.Here,thesimilarityofitemsetsismeasuredthroughtheuseofthepatternmatchingalgorithmstofindthebest-matchedpatterns[59].Ifr′

jexists,thenoneoverlapisfoundbetweenR1andR2,whichmeansA≈A′,B≈B′,andρ=ρ′.Itisnotablethatthecorrespondingitemsetsneedtobeapproximatelyequal,whereas,thetemporalrelationshavetobethesame.

SearchingfortheoccurrenceofoneruleinarulesetispresentedinAlgo-rithm8.1.ThisalgorithmshowshowtofindthemostanalogousrulefromR

toaninputruler(withtheassumptionofr/∈R).IfsuchamatchedrulermcanbedetectedinR,itderivesthattherulermostlikelyappearsinRaswell.


Algorithm 8.1: RuleMatch(r,R, IR)

Data: r : Aρ=⇒ B, R = {r1,· · · , rn} with set of items IR, (r /∈ R).

Result: rm : Amρ=⇒ Bm, where rm ∈ R and Am,Bm ⊂ IR.

Am ← best patterns matched to A from IR;Bm ← best patterns matched to B from IR;rm ← Am

ρ=⇒ Bm;

foreach ri : Aiρi=⇒ Bi ∈ R do

if ri = rm (Ai = Am & Bi = Bm & ρi = ρ) thenreturn rm;

return ∅; //rule not found

Algorithm 8.2: Occurrence(R1, R2, IR2 )

Data: Rule set R1, and rule set R2 with set of items IR2 .Result: ratioR1inR2 : Occurrence ratio of R1 in R2.weightR1inR2 ← 0;WeightR2 ← 0;foreach ri ∈ R1 do

r ′ ← RuleMatch(ri,R2, IR2 );if r ′ = ∅ then

weightR1inR2 ← weightR1inR2 + supR2(r′)× confR2(r

′);

foreach rj ∈ R2 doWeightR2 ← WeightR2 + supR2(rj)× confR2(rj);

ratioR1inR2 ← weightR1inR2/WeightR2 ;return ratioR1inR2

The next step is to measure how strong a rule occurs in another rule set. Themethod for checking the occurrence of a rule in another rule set leads to definea non-symmetric similarity measure, called Occurrence R1(R2), the occurrenceratio of R1 in R2 (previously called Appearance ratio in [30]). This measurerepresents how often rules with high support and confidence that appear in R1

also occur in R2, considering their strength in R2. It means that while findingthe closest rules of R2 to the rules in R1, the values of support and confidenceof matched rules are also considered in the occurrence ratio.

Algorithm 8.2 presents computing the occurrence ratio measure, which isscaled by the summation on the support and confidence of the rules in R2.Evidently, if the occurrence ratio of a rule set in another is considerably high, itshows these two rule sets are meaningfully associated. In contrast, if the ratiois considerably low, it indicates of few connections between rule sets, in a sensethat two rule sets are distinct.


Algorithm8.1:RuleMatch(r,R,IR)

Data:r:Aρ=⇒B,R={r1,···,rn}withsetofitemsIR,(r/∈R).

Result:rm:Amρ=⇒Bm,whererm∈RandAm,Bm⊂IR.

Am←bestpatternsmatchedtoAfromIR;Bm←bestpatternsmatchedtoBfromIR;rm←Am

ρ=⇒Bm;

foreachri:Aiρi=⇒Bi∈Rdo

ifri=rm(Ai=Am&Bi=Bm&ρi=ρ)thenreturnrm;

return∅;//rulenotfound

Algorithm8.2:Occurrence(R1,R2,IR2)

Data:RulesetR1,andrulesetR2withsetofitemsIR2.Result:ratioR1inR2:OccurrenceratioofR1inR2.weightR1inR2←0;WeightR2←0;foreachri∈R1do

r′←RuleMatch(ri,R2,IR2);ifr′=∅then

weightR1inR2←weightR1inR2+supR2(r′)×confR2(r′);

foreachrj∈R2doWeightR2←WeightR2+supR2(rj)×confR2(rj);

ratioR1inR2←weightR1inR2/WeightR2;returnratioR1inR2

Thenextstepistomeasurehowstrongaruleoccursinanotherruleset.Themethodforcheckingtheoccurrenceofaruleinanotherrulesetleadstodefineanon-symmetricsimilaritymeasure,calledOccurrenceR1(R2),theoccurrenceratioofR1inR2(previouslycalledAppearanceratioin[30]).ThismeasurerepresentshowoftenruleswithhighsupportandconfidencethatappearinR1

alsooccurinR2,consideringtheirstrengthinR2.ItmeansthatwhilefindingtheclosestrulesofR2totherulesinR1,thevaluesofsupportandconfidenceofmatchedrulesarealsoconsideredintheoccurrenceratio.

Algorithm8.2presentscomputingtheoccurrenceratiomeasure,whichisscaledbythesummationonthesupportandconfidenceoftherulesinR2.Evidently,iftheoccurrenceratioofarulesetinanotherisconsiderablyhigh,itshowsthesetworulesetsaremeaningfullyassociated.Incontrast,iftheratioisconsiderablylow,itindicatesoffewconnectionsbetweenrulesets,inasensethattworulesetsaredistinct.


Algorithm 8.1: RuleMatch(r,R, IR)

Data: r : Aρ=⇒ B, R = {r1,· · · , rn} with set of items IR, (r /∈ R).

Result: rm : Amρ=⇒ Bm, where rm ∈ R and Am,Bm ⊂ IR.

Am ← best patterns matched to A from IR;Bm ← best patterns matched to B from IR;rm ← Am

ρ=⇒ Bm;

foreach ri : Aiρi=⇒ Bi ∈ R do

if ri = rm (Ai = Am & Bi = Bm & ρi = ρ) thenreturn rm;

return ∅; //rule not found

Algorithm 8.2: Occurrence(R1, R2, IR2 )

Data: Rule set R1, and rule set R2 with set of items IR2 .Result: ratioR1inR2 : Occurrence ratio of R1 in R2.weightR1inR2 ← 0;WeightR2 ← 0;foreach ri ∈ R1 do

r ′ ← RuleMatch(ri,R2, IR2 );if r ′ = ∅ then

weightR1inR2 ← weightR1inR2 + supR2(r′)× confR2(r

′);

foreach rj ∈ R2 doWeightR2 ← WeightR2 + supR2(rj)× confR2(rj);

ratioR1inR2 ← weightR1inR2/WeightR2 ;return ratioR1inR2

The next step is to measure how strong a rule occurs in another rule set. Themethod for checking the occurrence of a rule in another rule set leads to definea non-symmetric similarity measure, called Occurrence R1(R2), the occurrenceratio of R1 in R2 (previously called Appearance ratio in [30]). This measurerepresents how often rules with high support and confidence that appear in R1

also occur in R2, considering their strength in R2. It means that while findingthe closest rules of R2 to the rules in R1, the values of support and confidenceof matched rules are also considered in the occurrence ratio.

Algorithm 8.2 presents computing the occurrence ratio measure, which isscaled by the summation on the support and confidence of the rules in R2.Evidently, if the occurrence ratio of a rule set in another is considerably high, itshows these two rule sets are meaningfully associated. In contrast, if the ratiois considerably low, it indicates of few connections between rule sets, in a sensethat two rule sets are distinct.


Algorithm8.1:RuleMatch(r,R,IR)

Data:r:Aρ=⇒B,R={r1,···,rn}withsetofitemsIR,(r/∈R).

Result:rm:Amρ=⇒Bm,whererm∈RandAm,Bm⊂IR.

Am←bestpatternsmatchedtoAfromIR;Bm←bestpatternsmatchedtoBfromIR;rm←Am

ρ=⇒Bm;

foreachri:Aiρi=⇒Bi∈Rdo

ifri=rm(Ai=Am&Bi=Bm&ρi=ρ)thenreturnrm;

return∅;//rulenotfound

Algorithm8.2:Occurrence(R1,R2,IR2)

Data:RulesetR1,andrulesetR2withsetofitemsIR2.Result:ratioR1inR2:OccurrenceratioofR1inR2.weightR1inR2←0;WeightR2←0;foreachri∈R1do

r′←RuleMatch(ri,R2,IR2);ifr′=∅then

weightR1inR2←weightR1inR2+supR2(r′)×confR2(r′);

foreachrj∈R2doWeightR2←WeightR2+supR2(rj)×confR2(rj);

ratioR1inR2←weightR1inR2/WeightR2;returnratioR1inR2

Thenextstepistomeasurehowstrongaruleoccursinanotherruleset.Themethodforcheckingtheoccurrenceofaruleinanotherrulesetleadstodefineanon-symmetricsimilaritymeasure,calledOccurrenceR1(R2),theoccurrenceratioofR1inR2(previouslycalledAppearanceratioin[30]).ThismeasurerepresentshowoftenruleswithhighsupportandconfidencethatappearinR1

alsooccurinR2,consideringtheirstrengthinR2.ItmeansthatwhilefindingtheclosestrulesofR2totherulesinR1,thevaluesofsupportandconfidenceofmatchedrulesarealsoconsideredintheoccurrenceratio.

Algorithm8.2presentscomputingtheoccurrenceratiomeasure,whichisscaledbythesummationonthesupportandconfidenceoftherulesinR2.Evidently,iftheoccurrenceratioofarulesetinanotherisconsiderablyhigh,itshowsthesetworulesetsaremeaningfullyassociated.Incontrast,iftheratioisconsiderablylow,itindicatesoffewconnectionsbetweenrulesets,inasensethattworulesetsaredistinct.


8.1.4 Results: Distinctive Rules in Clinical Settings

This section presents the experimental results of the rule sets in clinical settingsfrom MIMIC database records. As discussed in Section 7.1.2, the raw data isfetched from the online MIMIC database. This data set includes a set of physio-logical measure nets that each belongs to a subject (i.e., patient) with a certaintype of clinical condition. In this data set, only records that include three healthparameters heart rate (HR), blood pressure (BP) and respiration rate (RR) areconsidered. The average length of all the measurements for a clinical conditionin the data set is about 250 hours, and this average for a subject is around 50hours. For each of the nine clinical conditions described in Table 7.1, tempo-ral rule mining approach is applied to two pairs of sensor data: HR&BP andHR&RR. The output model of the rule mining approach is thus a collection ofrule sets for clinical conditions.

Parameter Selection

To select the optimal values during pattern abstraction and rule mining phases,a voting approach is used considering the strength of the generated rules. Fourparameters are optimised: window size, number of clusters, and the best thresh-olds for support and confidence. The window size is applied between 1 to 5minutes, and the number of clusters is set between 5 to 9. Due to the smallnumber of patients in the data set, optimising the parameters by applying rulemining on the entire measurements will lead to overfitting the model, and itdoes not indicate how well the method will generalise to unknown data sets. So,to avoid overfitting the model, a leave-one-out cross-validation approach [50] isused for finding the best parameters. For each clinical condition, this approachleaves out a patient’s records in each fold of the validation as the hold-out setand applies the modelling on the rest of the patients as the training set.

After preparing the data for both training and hold-out sets for each com-bination of parameters, the validation of parameters are examined with twomeasures: Interest and J-measure. These measures indicate the quality of rulesin different aspects [221].

By voting between the top rules with the highest values in these measureson the hold-out sets in all the iterations (folds), this approach is able to find theparameters that achieve the best average results. More precisely, with differentvalues of w and k, all the models (on the train and test data sets) are gener-ated in each clinical condition. Then, with different thresholds on support andconfidence, the measures Interest and J-measure are calculated.

By voting again on the highest results of measures in the entire models, thebest values of parameters are selected as w = 180 and k = 7. These valuesare obtained with the best cut-off values minsup = 0.05 and minconf =0.45. Figure 8.2 depicts an example of applying the cross-validation approachfor the MI clinical condition. This run includes six iterations (folds) for six


8.1.4Results:DistinctiveRulesinClinicalSettings

ThissectionpresentstheexperimentalresultsoftherulesetsinclinicalsettingsfromMIMICdatabaserecords.AsdiscussedinSection7.1.2,therawdataisfetchedfromtheonlineMIMICdatabase.Thisdatasetincludesasetofphysio-logicalmeasurenetsthateachbelongstoasubject(i.e.,patient)withacertaintypeofclinicalcondition.Inthisdataset,onlyrecordsthatincludethreehealthparametersheartrate(HR),bloodpressure(BP)andrespirationrate(RR)areconsidered.Theaveragelengthofallthemeasurementsforaclinicalconditioninthedatasetisabout250hours,andthisaverageforasubjectisaround50hours.ForeachofthenineclinicalconditionsdescribedinTable7.1,tempo-ralruleminingapproachisappliedtotwopairsofsensordata:HR&BPandHR&RR.Theoutputmodeloftheruleminingapproachisthusacollectionofrulesetsforclinicalconditions.

ParameterSelection

Toselecttheoptimalvaluesduringpatternabstractionandruleminingphases,avotingapproachisusedconsideringthestrengthofthegeneratedrules.Fourparametersareoptimised:windowsize,numberofclusters,andthebestthresh-oldsforsupportandconfidence.Thewindowsizeisappliedbetween1to5minutes,andthenumberofclustersissetbetween5to9.Duetothesmallnumberofpatientsinthedataset,optimisingtheparametersbyapplyingruleminingontheentiremeasurementswillleadtooverfittingthemodel,anditdoesnotindicatehowwellthemethodwillgeneralisetounknowndatasets.So,toavoidoverfittingthemodel,aleave-one-outcross-validationapproach[50]isusedforfindingthebestparameters.Foreachclinicalcondition,thisapproachleavesoutapatient’srecordsineachfoldofthevalidationasthehold-outsetandappliesthemodellingontherestofthepatientsasthetrainingset.

Afterpreparingthedataforbothtrainingandhold-outsetsforeachcom-binationofparameters,thevalidationofparametersareexaminedwithtwomeasures:InterestandJ-measure.Thesemeasuresindicatethequalityofrulesindifferentaspects[221].

Byvotingbetweenthetopruleswiththehighestvaluesinthesemeasuresonthehold-outsetsinalltheiterations(folds),thisapproachisabletofindtheparametersthatachievethebestaverageresults.Moreprecisely,withdifferentvaluesofwandk,allthemodels(onthetrainandtestdatasets)aregener-atedineachclinicalcondition.Then,withdifferentthresholdsonsupportandconfidence,themeasuresInterestandJ-measurearecalculated.

Byvotingagainonthehighestresultsofmeasuresintheentiremodels,thebestvaluesofparametersareselectedasw=180andk=7.Thesevaluesareobtainedwiththebestcut-offvaluesminsup=0.05andminconf=0.45.Figure8.2depictsanexampleofapplyingthecross-validationapproachfortheMIclinicalcondition.Thisrunincludessixiterations(folds)forsix


8.1.4 Results: Distinctive Rules in Clinical Settings

This section presents the experimental results of the rule sets in clinical settingsfrom MIMIC database records. As discussed in Section 7.1.2, the raw data isfetched from the online MIMIC database. This data set includes a set of physio-logical measure nets that each belongs to a subject (i.e., patient) with a certaintype of clinical condition. In this data set, only records that include three healthparameters heart rate (HR), blood pressure (BP) and respiration rate (RR) areconsidered. The average length of all the measurements for a clinical conditionin the data set is about 250 hours, and this average for a subject is around 50hours. For each of the nine clinical conditions described in Table 7.1, tempo-ral rule mining approach is applied to two pairs of sensor data: HR&BP andHR&RR. The output model of the rule mining approach is thus a collection ofrule sets for clinical conditions.

Parameter Selection

To select the optimal values during pattern abstraction and rule mining phases,a voting approach is used considering the strength of the generated rules. Fourparameters are optimised: window size, number of clusters, and the best thresh-olds for support and confidence. The window size is applied between 1 to 5minutes, and the number of clusters is set between 5 to 9. Due to the smallnumber of patients in the data set, optimising the parameters by applying rulemining on the entire measurements will lead to overfitting the model, and itdoes not indicate how well the method will generalise to unknown data sets. So,to avoid overfitting the model, a leave-one-out cross-validation approach [50] isused for finding the best parameters. For each clinical condition, this approachleaves out a patient’s records in each fold of the validation as the hold-out setand applies the modelling on the rest of the patients as the training set.

After preparing the data for both training and hold-out sets for each com-bination of parameters, the validation of parameters are examined with twomeasures: Interest and J-measure. These measures indicate the quality of rulesin different aspects [221].

By voting between the top rules with the highest values in these measureson the hold-out sets in all the iterations (folds), this approach is able to find theparameters that achieve the best average results. More precisely, with differentvalues of w and k, all the models (on the train and test data sets) are gener-ated in each clinical condition. Then, with different thresholds on support andconfidence, the measures Interest and J-measure are calculated.

By voting again on the highest results of measures in the entire models, thebest values of parameters are selected as w = 180 and k = 7. These valuesare obtained with the best cut-off values minsup = 0.05 and minconf =0.45. Figure 8.2 depicts an example of applying the cross-validation approachfor the MI clinical condition. This run includes six iterations (folds) for six


8.1.4Results:DistinctiveRulesinClinicalSettings

ThissectionpresentstheexperimentalresultsoftherulesetsinclinicalsettingsfromMIMICdatabaserecords.AsdiscussedinSection7.1.2,therawdataisfetchedfromtheonlineMIMICdatabase.Thisdatasetincludesasetofphysio-logicalmeasurenetsthateachbelongstoasubject(i.e.,patient)withacertaintypeofclinicalcondition.Inthisdataset,onlyrecordsthatincludethreehealthparametersheartrate(HR),bloodpressure(BP)andrespirationrate(RR)areconsidered.Theaveragelengthofallthemeasurementsforaclinicalconditioninthedatasetisabout250hours,andthisaverageforasubjectisaround50hours.ForeachofthenineclinicalconditionsdescribedinTable7.1,tempo-ralruleminingapproachisappliedtotwopairsofsensordata:HR&BPandHR&RR.Theoutputmodeloftheruleminingapproachisthusacollectionofrulesetsforclinicalconditions.

ParameterSelection

Toselecttheoptimalvaluesduringpatternabstractionandruleminingphases,avotingapproachisusedconsideringthestrengthofthegeneratedrules.Fourparametersareoptimised:windowsize,numberofclusters,andthebestthresh-oldsforsupportandconfidence.Thewindowsizeisappliedbetween1to5minutes,andthenumberofclustersissetbetween5to9.Duetothesmallnumberofpatientsinthedataset,optimisingtheparametersbyapplyingruleminingontheentiremeasurementswillleadtooverfittingthemodel,anditdoesnotindicatehowwellthemethodwillgeneralisetounknowndatasets.So,toavoidoverfittingthemodel,aleave-one-outcross-validationapproach[50]isusedforfindingthebestparameters.Foreachclinicalcondition,thisapproachleavesoutapatient’srecordsineachfoldofthevalidationasthehold-outsetandappliesthemodellingontherestofthepatientsasthetrainingset.

Afterpreparingthedataforbothtrainingandhold-outsetsforeachcom-binationofparameters,thevalidationofparametersareexaminedwithtwomeasures:InterestandJ-measure.Thesemeasuresindicatethequalityofrulesindifferentaspects[221].

Byvotingbetweenthetopruleswiththehighestvaluesinthesemeasuresonthehold-outsetsinalltheiterations(folds),thisapproachisabletofindtheparametersthatachievethebestaverageresults.Moreprecisely,withdifferentvaluesofwandk,allthemodels(onthetrainandtestdatasets)aregener-atedineachclinicalcondition.Then,withdifferentthresholdsonsupportandconfidence,themeasuresInterestandJ-measurearecalculated.

Byvotingagainonthehighestresultsofmeasuresintheentiremodels,thebestvaluesofparametersareselectedasw=180andk=7.Thesevaluesareobtainedwiththebestcut-offvaluesminsup=0.05andminconf=0.45.Figure8.2depictsanexampleofapplyingthecross-validationapproachfortheMIclinicalcondition.Thisrunincludessixiterations(folds)forsix


0.5 0.75 10.55

0.75

0.9

k=5

k=7k=9

w=60

w=180

w=300

k’=5

k’=7

k’=9

w’=60

w’=180

w’=300

Iteration #4

Interest

J−

me

asu

re

k in train w in train k’ in hold−out w’ in hold−out

Figure 8.2: Result of cross-validation approach on a selected iteration for MIcondition. The diagram shows the values of measures Interest and J-measurefor various values of parameters w and k in both training set and holding set.

patients with the MI condition. But just to illustrate the values of two measuresfor each parameter, the figure shows the results of one iteration. As shown inFigure 8.2, the best values of measures Interest and J-measure are achieved bythe mentioned values of w and k for a selected hold-out set.

Output Temporal Rule Sets

The results of rule sets in clinical conditions have been published in two ver-sions by the author of this thesis. In the first version, only the after relationin patterns of HR and the other two variables has been considered. The re-sults of this version are published in [30]), but not detailed in this chapter. Theexplained approach in this chapter (published in [32]) is the extension of theearly version to consider more temporal relations. It is worth mentioning thatthe number of temporal relations is just one aspect of the distinctions betweenthese two approaches. To compare the results of the extended version with theprevious one, let’s call the first approach TRM-ρ1 (temporal rule mining withone relation), and the extended approach TRM-ρ3 (temporal rule mining withthree relations). Figure 8.3 shows the number of rules provided in both TRM-ρ1

and TRM-ρ3 approaches concerning the multivariate time series HR&BP andHR&RR in each clinical condition. The output sets of temporal rules indicatea collection of data-driven features which are independently able to describetheir corresponding clinical conditions. To illustrate the variation of prototyp-


0.50.7510.55

0.75

0.9

k=5

k=7 k=9

w=60

w=180

w=300

k’=5

k’=7

k’=9

w’=60

w’=180

w’=300

Iteration #4

Interest

J−

me

asu

re

k in trainw in traink’ in hold−outw’ in hold−out

Figure8.2:Resultofcross-validationapproachonaselectediterationforMIcondition.ThediagramshowsthevaluesofmeasuresInterestandJ-measureforvariousvaluesofparameterswandkinbothtrainingsetandholdingset.

patientswiththeMIcondition.Butjusttoillustratethevaluesoftwomeasuresforeachparameter,thefigureshowstheresultsofoneiteration.AsshowninFigure8.2,thebestvaluesofmeasuresInterestandJ-measureareachievedbythementionedvaluesofwandkforaselectedhold-outset.

OutputTemporalRuleSets

Theresultsofrulesetsinclinicalconditionshavebeenpublishedintwover-sionsbytheauthorofthisthesis.Inthefirstversion,onlytheafterrelationinpatternsofHRandtheothertwovariableshasbeenconsidered.There-sultsofthisversionarepublishedin[30]),butnotdetailedinthischapter.Theexplainedapproachinthischapter(publishedin[32])istheextensionoftheearlyversiontoconsidermoretemporalrelations.Itisworthmentioningthatthenumberoftemporalrelationsisjustoneaspectofthedistinctionsbetweenthesetwoapproaches.Tocomparetheresultsoftheextendedversionwiththepreviousone,let’scallthefirstapproachTRM-ρ1(temporalruleminingwithonerelation),andtheextendedapproachTRM-ρ3(temporalruleminingwiththreerelations).Figure8.3showsthenumberofrulesprovidedinbothTRM-ρ1

andTRM-ρ3approachesconcerningthemultivariatetimeseriesHR&BPandHR&RRineachclinicalcondition.Theoutputsetsoftemporalrulesindicateacollectionofdata-drivenfeatureswhichareindependentlyabletodescribetheircorrespondingclinicalconditions.Toillustratethevariationofprototyp-


0.5 0.75 10.55

0.75

0.9

k=5

k=7k=9

w=60

w=180

w=300

k’=5

k’=7

k’=9

w’=60

w’=180

w’=300

Iteration #4

Interest

J−

me

asu

re

k in train w in train k’ in hold−out w’ in hold−out

Figure 8.2: Result of cross-validation approach on a selected iteration for MIcondition. The diagram shows the values of measures Interest and J-measurefor various values of parameters w and k in both training set and holding set.

patients with the MI condition. But just to illustrate the values of two measuresfor each parameter, the figure shows the results of one iteration. As shown inFigure 8.2, the best values of measures Interest and J-measure are achieved bythe mentioned values of w and k for a selected hold-out set.

Output Temporal Rule Sets

The results of rule sets in clinical conditions have been published in two ver-sions by the author of this thesis. In the first version, only the after relationin patterns of HR and the other two variables has been considered. The re-sults of this version are published in [30]), but not detailed in this chapter. Theexplained approach in this chapter (published in [32]) is the extension of theearly version to consider more temporal relations. It is worth mentioning thatthe number of temporal relations is just one aspect of the distinctions betweenthese two approaches. To compare the results of the extended version with theprevious one, let’s call the first approach TRM-ρ1 (temporal rule mining withone relation), and the extended approach TRM-ρ3 (temporal rule mining withthree relations). Figure 8.3 shows the number of rules provided in both TRM-ρ1

and TRM-ρ3 approaches concerning the multivariate time series HR&BP andHR&RR in each clinical condition. The output sets of temporal rules indicatea collection of data-driven features which are independently able to describetheir corresponding clinical conditions. To illustrate the variation of prototyp-


0.50.7510.55

0.75

0.9

k=5

k=7 k=9

w=60

w=180

w=300

k’=5

k’=7

k’=9

w’=60

w’=180

w’=300

Iteration #4

Interest

J−

me

asu

re

k in trainw in traink’ in hold−outw’ in hold−out

Figure8.2:Resultofcross-validationapproachonaselectediterationforMIcondition.ThediagramshowsthevaluesofmeasuresInterestandJ-measureforvariousvaluesofparameterswandkinbothtrainingsetandholdingset.

patientswiththeMIcondition.Butjusttoillustratethevaluesoftwomeasuresforeachparameter,thefigureshowstheresultsofoneiteration.AsshowninFigure8.2,thebestvaluesofmeasuresInterestandJ-measureareachievedbythementionedvaluesofwandkforaselectedhold-outset.

OutputTemporalRuleSets

Theresultsofrulesetsinclinicalconditionshavebeenpublishedintwover-sionsbytheauthorofthisthesis.Inthefirstversion,onlytheafterrelationinpatternsofHRandtheothertwovariableshasbeenconsidered.There-sultsofthisversionarepublishedin[30]),butnotdetailedinthischapter.Theexplainedapproachinthischapter(publishedin[32])istheextensionoftheearlyversiontoconsidermoretemporalrelations.Itisworthmentioningthatthenumberoftemporalrelationsisjustoneaspectofthedistinctionsbetweenthesetwoapproaches.Tocomparetheresultsoftheextendedversionwiththepreviousone,let’scallthefirstapproachTRM-ρ1(temporalruleminingwithonerelation),andtheextendedapproachTRM-ρ3(temporalruleminingwiththreerelations).Figure8.3showsthenumberofrulesprovidedinbothTRM-ρ1

andTRM-ρ3approachesconcerningthemultivariatetimeseriesHR&BPandHR&RRineachclinicalcondition.Theoutputsetsoftemporalrulesindicateacollectionofdata-drivenfeatureswhichareindependentlyabletodescribetheircorrespondingclinicalconditions.Toillustratethevariationofprototyp-


Resp. failure Bleed CHF Brain injury Sepsis MI Angina Valve CABG0

5

10

15

20

Num

ber

of R

ule

s

HR&BP

TRM−ρ3

TRM−ρ1


5

10

15

20

25

Num

ber

of R

ule

s

HR&RR

TRM−ρ3

TRM−ρ1

Figure 8.3: The number of rules for clinical conditions in TRM-ρ3 and TRM-ρ1

methods, in relation to the multivariate time series HR&BP and HR&RR.

ical patterns among the temporal rules, a selection of distinct temporal rulesfrom different rule sets with different temporal relations is represented in Fig-ure 8.4.

8.1.5 Evaluation of Rule Set Similarity in Clinical Conditions

This section evaluates the uniqueness of rule sets for clinical conditions. For thisreason, the new evaluation method based on the similarity function proposedin Section 8.1.3 is applied to measure the occurrence ratio of rules in other rulesets. This evaluation is first applied to the rule sets of clinical conditions to showthe distinctness of rule sets. Besides, the rule sets for a selection of subjects inthe same data set are compared with all the clinical conditions to consider thecloseness of subjects to their corresponding conditions.

Occurrence Ratio of Rule Sets in Clinical Conditions

Based on the rule sets achieved from the TRM-ρ3 method for clinical condi-tions, the evaluation approach is applied to each pair of rule sets. For nineclinical categories, the occurrence ratios for temporal rule sets are calculated.


Resp. failureBleedCHFBrain injurySepsisMIAnginaValveCABG0

5

10

15

20

Num

ber o

f Rule

s

HR&BP

TRM−ρ3

TRM−ρ1


5

10

15

20

25

Num

ber o

f Rule

s

HR&RR

TRM−ρ3

TRM−ρ1

Figure8.3:ThenumberofrulesforclinicalconditionsinTRM-ρ3andTRM-ρ1

methods,inrelationtothemultivariatetimeseriesHR&BPandHR&RR.

icalpatternsamongthetemporalrules,aselectionofdistincttemporalrulesfromdifferentrulesetswithdifferenttemporalrelationsisrepresentedinFig-ure8.4.

8.1.5EvaluationofRuleSetSimilarityinClinicalConditions

Thissectionevaluatestheuniquenessofrulesetsforclinicalconditions.Forthisreason,thenewevaluationmethodbasedonthesimilarityfunctionproposedinSection8.1.3isappliedtomeasuretheoccurrenceratioofrulesinotherrulesets.Thisevaluationisfirstappliedtotherulesetsofclinicalconditionstoshowthedistinctnessofrulesets.Besides,therulesetsforaselectionofsubjectsinthesamedatasetarecomparedwithalltheclinicalconditionstoconsidertheclosenessofsubjectstotheircorrespondingconditions.

OccurrenceRatioofRuleSetsinClinicalConditions

BasedontherulesetsachievedfromtheTRM-ρ3methodforclinicalcondi-tions,theevaluationapproachisappliedtoeachpairofrulesets.Fornineclinicalcategories,theoccurrenceratiosfortemporalrulesetsarecalculated.



5

10

15

20

Num

ber

of R

ule

s

HR&BP

TRM−ρ3

TRM−ρ1


5

10

15

20

25

Num

ber

of R

ule

s

HR&RR

TRM−ρ3

TRM−ρ1

Figure 8.3: The number of rules for clinical conditions in TRM-ρ3 and TRM-ρ1

methods, in relation to the multivariate time series HR&BP and HR&RR.

ical patterns among the temporal rules, a selection of distinct temporal rulesfrom different rule sets with different temporal relations is represented in Fig-ure 8.4.

8.1.5 Evaluation of Rule Set Similarity in Clinical Conditions

This section evaluates the uniqueness of rule sets for clinical conditions. For thisreason, the new evaluation method based on the similarity function proposedin Section 8.1.3 is applied to measure the occurrence ratio of rules in other rulesets. This evaluation is first applied to the rule sets of clinical conditions to showthe distinctness of rule sets. Besides, the rule sets for a selection of subjects inthe same data set are compared with all the clinical conditions to consider thecloseness of subjects to their corresponding conditions.

Occurrence Ratio of Rule Sets in Clinical Conditions

Based on the rule sets achieved from the TRM-ρ3 method for clinical condi-tions, the evaluation approach is applied to each pair of rule sets. For nineclinical categories, the occurrence ratios for temporal rule sets are calculated.



5

10

15

20

Num

ber o

f Rule

s

HR&BP

TRM−ρ3

TRM−ρ1


5

10

15

20

25

Num

ber o

f Rule

s

HR&RR

TRM−ρ3

TRM−ρ1

Figure8.3:ThenumberofrulesforclinicalconditionsinTRM-ρ3andTRM-ρ1

methods,inrelationtothemultivariatetimeseriesHR&BPandHR&RR.

icalpatternsamongthetemporalrules,aselectionofdistincttemporalrulesfromdifferentrulesetswithdifferenttemporalrelationsisrepresentedinFig-ure8.4.

8.1.5EvaluationofRuleSetSimilarityinClinicalConditions

Thissectionevaluatestheuniquenessofrulesetsforclinicalconditions.Forthisreason,thenewevaluationmethodbasedonthesimilarityfunctionproposedinSection8.1.3isappliedtomeasuretheoccurrenceratioofrulesinotherrulesets.Thisevaluationisfirstappliedtotherulesetsofclinicalconditionstoshowthedistinctnessofrulesets.Besides,therulesetsforaselectionofsubjectsinthesamedatasetarecomparedwithalltheclinicalconditionstoconsidertheclosenessofsubjectstotheircorrespondingconditions.

OccurrenceRatioofRuleSetsinClinicalConditions

BasedontherulesetsachievedfromtheTRM-ρ3methodforclinicalcondi-tions,theevaluationapproachisappliedtoeachpairofrulesets.Fornineclinicalcategories,theoccurrenceratiosfortemporalrulesetsarecalculated.


50 100 150

−2

0

2

beats

/min

HR

50 100 150

−5

0

5

mm

Hg

BP

(a) Bleed, ‘equal’,sup=30%, conf=55%

50 100 150

−0.2

0

0.2

bre

ath

s/m

in

RR

50 100 150

−0.1

0

0.1

0.2

beats

/min

HR

(b) MI, ‘equal’,sup=25%, conf=64%

50 100 150

−8

−6

−4

−2

0

2

4

beats

/min

HR

50 100 150

−0.1

0

0.1

0.2

mm

Hg

BP

(c) Angina, ‘before’,sup=90%, conf=88%

50 100 150

−4

−2

0

2

bre

ath

s/m

in

RR

50 100 150

−0.1

0

0.1

beats

/min

HR

(d) Valve, ‘before’,sup=80%, conf=83%

50 100 150

−2

−1

0

1

2beats

/min

HR

50 100 150

−0.1

−0.05

0

0.05

0.1

mm

Hg

BP

(e) MI, ‘after’,sup=60%, conf=70%

50 100 150

−1

0

1

2

3

bre

ath

s/m

in

RR

50 100 150−4

−2

0

2

4

6

beats

/min

HR

(f) Sepsis, ‘after’,sup=50%, conf=86%

Figure 8.4: A selection of distinct temporal rules generated from physiologicaldata in clinical conditions using TRM-ρ3 approach.

As an example of output ratios, the matrix in Table 8.1 shows the obtainedvalues of the occurrence ratio for temporal rule sets in HR&RR time series.Since the occurrence ratio is a non-symmetric similarity function, the values inTable 8.1 are not symmetric. For instance, the Occurrence RMI

(RCHF) is 27%,whereas Occurrence RCHF

(RMI) is 16%. The main reason for this variation isthat the occurrence ratio is a weighted function which is calculated based onthe support and confidence of the rules in only the second rule set. So, a subsetof temporal rules with strong support and confidence values in their own ruleset may appear in another rule set with weak corresponding support and con-fidence in the second one. The results in the matrix show the low occurrenceratios between the rule sets of clinical conditions.

In comparison with the former method TRM-ρ1, the proposed approach fortemporal rule mining TRM-ρ3 is also providing much lower values of occur-rence ratios between clinical conditions since the temporal rules are more spe-cialised using the extra temporal relations. Figure 8.5 depicts a graphical com-parison between the results of the occurrence ratios based on two approachesTRM-ρ1 and TRM-ρ3 in a box plot diagram. This figure shows most of the ratio


50100150

−2

0

2

beats

/min

HR

50100150

−5

0

5

mm

Hg

BP

(a)Bleed,‘equal’,sup=30%,conf=55%

50100150

−0.2

0

0.2

bre

ath

s/m

in

RR

50100150

−0.1

0

0.1

0.2

beats

/min

HR

(b)MI,‘equal’,sup=25%,conf=64%

50100150

−8

−6

−4

−2

0

2

4

beats

/min

HR

50100150

−0.1

0

0.1

0.2

mm

Hg

BP

(c)Angina,‘before’,sup=90%,conf=88%

50100150

−4

−2

0

2

bre

ath

s/m

in

RR

50100150

−0.1

0

0.1

beats

/min

HR

(d)Valve,‘before’,sup=80%,conf=83%

50100150

−2

−1

0

1

2

beats

/min

HR

50100150

−0.1

−0.05

0

0.05

0.1

mm

Hg

BP

(e)MI,‘after’,sup=60%,conf=70%

50100150

−1

0

1

2

3

bre

ath

s/m

in

RR

50100150−4

−2

0

2

4

6

beats

/min

HR

(f)Sepsis,‘after’,sup=50%,conf=86%

Figure8.4:AselectionofdistincttemporalrulesgeneratedfromphysiologicaldatainclinicalconditionsusingTRM-ρ3approach.

Asanexampleofoutputratios,thematrixinTable8.1showstheobtainedvaluesoftheoccurrenceratiofortemporalrulesetsinHR&RRtimeseries.Sincetheoccurrenceratioisanon-symmetricsimilarityfunction,thevaluesinTable8.1arenotsymmetric.Forinstance,theOccurrenceRMI(RCHF)is27%,whereasOccurrenceRCHF(RMI)is16%.Themainreasonforthisvariationisthattheoccurrenceratioisaweightedfunctionwhichiscalculatedbasedonthesupportandconfidenceoftherulesinonlythesecondruleset.So,asubsetoftemporalruleswithstrongsupportandconfidencevaluesintheirownrulesetmayappearinanotherrulesetwithweakcorrespondingsupportandcon-fidenceinthesecondone.Theresultsinthematrixshowthelowoccurrenceratiosbetweentherulesetsofclinicalconditions.

IncomparisonwiththeformermethodTRM-ρ1,theproposedapproachfortemporalruleminingTRM-ρ3isalsoprovidingmuchlowervaluesofoccur-renceratiosbetweenclinicalconditionssincethetemporalrulesaremorespe-cialisedusingtheextratemporalrelations.Figure8.5depictsagraphicalcom-parisonbetweentheresultsoftheoccurrenceratiosbasedontwoapproachesTRM-ρ1andTRM-ρ3inaboxplotdiagram.Thisfigureshowsmostoftheratio


50 100 150

−2

0

2

beats

/min

HR

50 100 150

−5

0

5

mm

Hg

BP

(a) Bleed, ‘equal’,sup=30%, conf=55%

50 100 150

−0.2

0

0.2

bre

ath

s/m

in

RR

50 100 150

−0.1

0

0.1

0.2

beats

/min

HR

(b) MI, ‘equal’,sup=25%, conf=64%

50 100 150

−8

−6

−4

−2

0

2

4

beats

/min

HR

50 100 150

−0.1

0

0.1

0.2

mm

Hg

BP

(c) Angina, ‘before’,sup=90%, conf=88%

50 100 150

−4

−2

0

2

bre

ath

s/m

in

RR

50 100 150

−0.1

0

0.1

beats

/min

HR

(d) Valve, ‘before’,sup=80%, conf=83%

50 100 150

−2

−1

0

1

2

beats

/min

HR

50 100 150

−0.1

−0.05

0

0.05

0.1

mm

Hg

BP

(e) MI, ‘after’,sup=60%, conf=70%

50 100 150

−1

0

1

2

3

bre

ath

s/m

in

RR

50 100 150−4

−2

0

2

4

6

beats

/min

HR

(f) Sepsis, ‘after’,sup=50%, conf=86%

Figure 8.4: A selection of distinct temporal rules generated from physiologicaldata in clinical conditions using TRM-ρ3 approach.

As an example of output ratios, the matrix in Table 8.1 shows the obtainedvalues of the occurrence ratio for temporal rule sets in HR&RR time series.Since the occurrence ratio is a non-symmetric similarity function, the values inTable 8.1 are not symmetric. For instance, the Occurrence RMI

(RCHF) is 27%,whereas Occurrence RCHF

(RMI) is 16%. The main reason for this variation isthat the occurrence ratio is a weighted function which is calculated based onthe support and confidence of the rules in only the second rule set. So, a subsetof temporal rules with strong support and confidence values in their own ruleset may appear in another rule set with weak corresponding support and con-fidence in the second one. The results in the matrix show the low occurrenceratios between the rule sets of clinical conditions.

In comparison with the former method TRM-ρ1, the proposed approach fortemporal rule mining TRM-ρ3 is also providing much lower values of occur-rence ratios between clinical conditions since the temporal rules are more spe-cialised using the extra temporal relations. Figure 8.5 depicts a graphical com-parison between the results of the occurrence ratios based on two approachesTRM-ρ1 and TRM-ρ3 in a box plot diagram. This figure shows most of the ratio


50100150

−2

0

2

beats

/min

HR

50100150

−5

0

5

mm

Hg

BP

(a)Bleed,‘equal’,sup=30%,conf=55%

50100150

−0.2

0

0.2

bre

ath

s/m

in

RR

50100150

−0.1

0

0.1

0.2

beats

/min

HR

(b)MI,‘equal’,sup=25%,conf=64%

50100150

−8

−6

−4

−2

0

2

4

beats

/min

HR

50100150

−0.1

0

0.1

0.2

mm

Hg

BP

(c)Angina,‘before’,sup=90%,conf=88%

50100150

−4

−2

0

2

bre

ath

s/m

in

RR

50100150

−0.1

0

0.1

beats

/min

HR

(d)Valve,‘before’,sup=80%,conf=83%

50100150

−2

−1

0

1

2

beats

/min

HR

50100150

−0.1

−0.05

0

0.05

0.1

mm

Hg

BP

(e)MI,‘after’,sup=60%,conf=70%

50100150

−1

0

1

2

3

bre

ath

s/m

in

RR

50100150−4

−2

0

2

4

6

beats

/min

HR

(f)Sepsis,‘after’,sup=50%,conf=86%

Figure8.4:AselectionofdistincttemporalrulesgeneratedfromphysiologicaldatainclinicalconditionsusingTRM-ρ3approach.

Asanexampleofoutputratios,thematrixinTable8.1showstheobtainedvaluesoftheoccurrenceratiofortemporalrulesetsinHR&RRtimeseries.Sincetheoccurrenceratioisanon-symmetricsimilarityfunction,thevaluesinTable8.1arenotsymmetric.Forinstance,theOccurrenceRMI(RCHF)is27%,whereasOccurrenceRCHF(RMI)is16%.Themainreasonforthisvariationisthattheoccurrenceratioisaweightedfunctionwhichiscalculatedbasedonthesupportandconfidenceoftherulesinonlythesecondruleset.So,asubsetoftemporalruleswithstrongsupportandconfidencevaluesintheirownrulesetmayappearinanotherrulesetwithweakcorrespondingsupportandcon-fidenceinthesecondone.Theresultsinthematrixshowthelowoccurrenceratiosbetweentherulesetsofclinicalconditions.

IncomparisonwiththeformermethodTRM-ρ1,theproposedapproachfortemporalruleminingTRM-ρ3isalsoprovidingmuchlowervaluesofoccur-renceratiosbetweenclinicalconditionssincethetemporalrulesaremorespe-cialisedusingtheextratemporalrelations.Figure8.5depictsagraphicalcom-parisonbetweentheresultsoftheoccurrenceratiosbasedontwoapproachesTRM-ρ1andTRM-ρ3inaboxplotdiagram.Thisfigureshowsmostoftheratio


Table 8.1: Occurrence ratios of rule sets for each pair of clinical conditions inmultivariate time series HR&RR, using TRM-ρ3.

Res

p.fa

ilure

Ble

ed

CH

F

Bra

inin

jury

Seps

is

MI

Ang

ina

Post

-op

Val

ve

Post

-op

CA

BG

Resp. failure - 67% 0.2% 0% 3% 3% 2% 3% 7%Bleed 61% - 1% 6% 8% 2% 8% 3% 0.5%CHF 8% 4% - 1% 3% 16% 3% 0% 38%

Brain injury 11% 8% 0% - 0% 1% 0.5% 2% 13%Sepsis 3% 3% 3% 0% - 0% 1% 0% 0%

MI 2% 14% 27% 4% 0% - 3% 32% 15%Angina 1% 49% 5% 1% 0.1% 1% - 3% 27%

Post-op Valve 12% 2% 1% 10% 1% 16% 9% - 55%Post-op CABG 1% 0.5% 32% 41% 0.5% 6% 14% 65% -

values are close to the zero in both versions, but much closer to the zero for theTRM-ρ3 method. More precisely, for the occurrence ratios in clinical conditionsfor time series, more than 90% of all occurrence ratios are lower than 30%.Also, 83% of them are lower than 15% (in TRM-ρ1 it was 70% lower than 15%).So, this evaluation to some extent can guarantee that the temporal rule miningmethods generate relatively distinctive rule sets, which the rules in one categoryof the clinical condition can sufficiently provide an individual behaviour of itsvital signs.

Occurrence Ratio of Subjects in Clinical Conditions

Another aspect of validating the performance of rule set similarity measure isto show the robustness of this measure for analogous rule sets. For this reason,some individual subjects with specific clinical condition labels are consideredthrough the use of TRM-ρ3. The temporal rule set of each subject is then com-pared with the rule sets of each clinical category via measuring their occurrenceratio in the rule sets of clinical conditions, using a leave-one-out method toavoid overfitting the modelling and the comparison of rule sets. If the occur-rence ratio of a subject is higher in its corresponding clinical condition, ratherthan other conditions, then it shows the closeness of the subject’s rule set and itscorresponding condition’s model. In other words, every provided rule set canrepresent a descriptive model of specific features which is recognisable from theother models.

Since the number of subjects in some clinical conditions is not enough, toavoid having biased results, the subjects with four major clinical labels, Res-piratory failure, CHF, MI, Sepsis (33 subjects in total) are tested. The otherclinical conditions do not have enough subjects to perform this evaluation. The


Table8.1:OccurrenceratiosofrulesetsforeachpairofclinicalconditionsinmultivariatetimeseriesHR&RR,usingTRM-ρ3.

Resp.

failure

Bleed

CH

F

Brain

injury

Sepsis

MI

Angina

Post-opV

alve

Post-opC

AB

G

Resp.failure-67%0.2%0%3%3%2%3%7%Bleed61%-1%6%8%2%8%3%0.5%CHF8%4%-1%3%16%3%0%38%

Braininjury11%8%0%-0%1%0.5%2%13%Sepsis3%3%3%0%-0%1%0%0%

MI2%14%27%4%0%-3%32%15%Angina1%49%5%1%0.1%1%-3%27%

Post-opValve12%2%1%10%1%16%9%-55%Post-opCABG1%0.5%32%41%0.5%6%14%65%-

valuesareclosetothezeroinbothversions,butmuchclosertothezerofortheTRM-ρ3method.Moreprecisely,fortheoccurrenceratiosinclinicalconditionsfortimeseries,morethan90%ofalloccurrenceratiosarelowerthan30%.Also,83%ofthemarelowerthan15%(inTRM-ρ1itwas70%lowerthan15%).So,thisevaluationtosomeextentcanguaranteethatthetemporalruleminingmethodsgeneraterelativelydistinctiverulesets,whichtherulesinonecategoryoftheclinicalconditioncansufficientlyprovideanindividualbehaviourofitsvitalsigns.

OccurrenceRatioofSubjectsinClinicalConditions

Anotheraspectofvalidatingtheperformanceofrulesetsimilaritymeasureistoshowtherobustnessofthismeasureforanalogousrulesets.Forthisreason,someindividualsubjectswithspecificclinicalconditionlabelsareconsideredthroughtheuseofTRM-ρ3.Thetemporalrulesetofeachsubjectisthencom-paredwiththerulesetsofeachclinicalcategoryviameasuringtheiroccurrenceratiointherulesetsofclinicalconditions,usingaleave-one-outmethodtoavoidoverfittingthemodellingandthecomparisonofrulesets.Iftheoccur-renceratioofasubjectishigherinitscorrespondingclinicalcondition,ratherthanotherconditions,thenitshowstheclosenessofthesubject’srulesetanditscorrespondingcondition’smodel.Inotherwords,everyprovidedrulesetcanrepresentadescriptivemodelofspecificfeatureswhichisrecognisablefromtheothermodels.

Sincethenumberofsubjectsinsomeclinicalconditionsisnotenough,toavoidhavingbiasedresults,thesubjectswithfourmajorclinicallabels,Res-piratoryfailure,CHF,MI,Sepsis(33subjectsintotal)aretested.Theotherclinicalconditionsdonothaveenoughsubjectstoperformthisevaluation.The


Table 8.1: Occurrence ratios of rule sets for each pair of clinical conditions inmultivariate time series HR&RR, using TRM-ρ3.

Res

p.fa

ilure

Ble

ed

CH

F

Bra

inin

jury

Seps

is

MI

Ang

ina

Post

-op

Val

ve

Post

-op

CA

BG

Resp. failure - 67% 0.2% 0% 3% 3% 2% 3% 7%Bleed 61% - 1% 6% 8% 2% 8% 3% 0.5%CHF 8% 4% - 1% 3% 16% 3% 0% 38%

Brain injury 11% 8% 0% - 0% 1% 0.5% 2% 13%Sepsis 3% 3% 3% 0% - 0% 1% 0% 0%

MI 2% 14% 27% 4% 0% - 3% 32% 15%Angina 1% 49% 5% 1% 0.1% 1% - 3% 27%

Post-op Valve 12% 2% 1% 10% 1% 16% 9% - 55%Post-op CABG 1% 0.5% 32% 41% 0.5% 6% 14% 65% -

values are close to the zero in both versions, but much closer to the zero for theTRM-ρ3 method. More precisely, for the occurrence ratios in clinical conditionsfor time series, more than 90% of all occurrence ratios are lower than 30%.Also, 83% of them are lower than 15% (in TRM-ρ1 it was 70% lower than 15%).So, this evaluation to some extent can guarantee that the temporal rule miningmethods generate relatively distinctive rule sets, which the rules in one categoryof the clinical condition can sufficiently provide an individual behaviour of itsvital signs.

Occurrence Ratio of Subjects in Clinical Conditions

Another aspect of validating the performance of rule set similarity measure isto show the robustness of this measure for analogous rule sets. For this reason,some individual subjects with specific clinical condition labels are consideredthrough the use of TRM-ρ3. The temporal rule set of each subject is then com-pared with the rule sets of each clinical category via measuring their occurrenceratio in the rule sets of clinical conditions, using a leave-one-out method toavoid overfitting the modelling and the comparison of rule sets. If the occur-rence ratio of a subject is higher in its corresponding clinical condition, ratherthan other conditions, then it shows the closeness of the subject’s rule set and itscorresponding condition’s model. In other words, every provided rule set canrepresent a descriptive model of specific features which is recognisable from theother models.

Since the number of subjects in some clinical conditions is not enough, toavoid having biased results, the subjects with four major clinical labels, Res-piratory failure, CHF, MI, Sepsis (33 subjects in total) are tested. The otherclinical conditions do not have enough subjects to perform this evaluation. The


Table8.1:OccurrenceratiosofrulesetsforeachpairofclinicalconditionsinmultivariatetimeseriesHR&RR,usingTRM-ρ3.

Resp.

failure

Bleed

CH

F

Brain

injury

Sepsis

MI

Angina

Post-opV

alve

Post-opC

AB

G

Resp.failure-67%0.2%0%3%3%2%3%7%Bleed61%-1%6%8%2%8%3%0.5%CHF8%4%-1%3%16%3%0%38%

Braininjury11%8%0%-0%1%0.5%2%13%Sepsis3%3%3%0%-0%1%0%0%

MI2%14%27%4%0%-3%32%15%Angina1%49%5%1%0.1%1%-3%27%

Post-opValve12%2%1%10%1%16%9%-55%Post-opCABG1%0.5%32%41%0.5%6%14%65%-

valuesareclosetothezeroinbothversions,butmuchclosertothezerofortheTRM-ρ3method.Moreprecisely,fortheoccurrenceratiosinclinicalconditionsfortimeseries,morethan90%ofalloccurrenceratiosarelowerthan30%.Also,83%ofthemarelowerthan15%(inTRM-ρ1itwas70%lowerthan15%).So,thisevaluationtosomeextentcanguaranteethatthetemporalruleminingmethodsgeneraterelativelydistinctiverulesets,whichtherulesinonecategoryoftheclinicalconditioncansufficientlyprovideanindividualbehaviourofitsvitalsigns.

OccurrenceRatioofSubjectsinClinicalConditions

Anotheraspectofvalidatingtheperformanceofrulesetsimilaritymeasureistoshowtherobustnessofthismeasureforanalogousrulesets.Forthisreason,someindividualsubjectswithspecificclinicalconditionlabelsareconsideredthroughtheuseofTRM-ρ3.Thetemporalrulesetofeachsubjectisthencom-paredwiththerulesetsofeachclinicalcategoryviameasuringtheiroccurrenceratiointherulesetsofclinicalconditions,usingaleave-one-outmethodtoavoidoverfittingthemodellingandthecomparisonofrulesets.Iftheoccur-renceratioofasubjectishigherinitscorrespondingclinicalcondition,ratherthanotherconditions,thenitshowstheclosenessofthesubject’srulesetanditscorrespondingcondition’smodel.Inotherwords,everyprovidedrulesetcanrepresentadescriptivemodelofspecificfeatureswhichisrecognisablefromtheothermodels.

Sincethenumberofsubjectsinsomeclinicalconditionsisnotenough,toavoidhavingbiasedresults,thesubjectswithfourmajorclinicallabels,Res-piratoryfailure,CHF,MI,Sepsis(33subjectsintotal)aretested.Theotherclinicalconditionsdonothaveenoughsubjectstoperformthisevaluation.The

8.2. LINGUISTIC DESCRIPTIONS FOR PATTERNS AND TEMPORAL RULES131

Resp. failure Bleed CHF Brain injury Sepsis MI Angina Valve CABG0%

20%

40%

60%

80%

100%

Clinical Conditions

Occure

nce R

atio

TRM−ρ3

TRM−ρ1

Figure 8.5: Boxplot diagram of the occurrence ratios between one clinical con-dition’s rule set and the other conditions with TRM-ρ3 (each row in Table 8.1),in comparison with the results of TRM-ρ1.

rule set of a subject sbj is compared with every rule set achieved from nineclinical conditions Ci through the calculation of Occurrence Rsbj

(RCi), where

1 � i � 9. For each subject, two clinical conditions with top occurrence ratioshave been selected as nearest conditions to the subject. If the closest conditionsto a subject involve the same clinical label as the subject’s label, it shows arich similarity between the rule sets of the subject and its corresponding clinicalcondition. The number of subjects with the same clinical label in their nearestconditions can indicate the correctness of the proposed evaluation.

Table 8.2 shows the three significant clinical conditions, with the number ofsubjects in each of them, in which their clinical labels have been revealed as oneof their nearest conditions. For the subjects in CHF, MI, and Sepsis conditions,most of the rule sets were significantly close to their clinical model. However,the behaviour of rule sets for the Respiratory failure condition and its subjectsare not adequately similar.

8.2 Linguistic Descriptions for Patterns and

Temporal Rules

This section presents how a set of data-driven information can be turned intoa set of natural language sentences that are humanly understandable. In par-ticular, it explains template-based approaches which directly map the output

8.2.LINGUISTICDESCRIPTIONSFORPATTERNSANDTEMPORALRULES131

Resp. failureBleedCHFBrain injurySepsisMIAnginaValveCABG0%

20%

40%

60%

80%

100%

Clinical Conditions

Occure

nce R

atio

TRM−ρ3

TRM−ρ1

Figure8.5:Boxplotdiagramoftheoccurrenceratiosbetweenoneclinicalcon-dition’srulesetandtheotherconditionswithTRM-ρ3(eachrowinTable8.1),incomparisonwiththeresultsofTRM-ρ1.

rulesetofasubjectsbjiscomparedwitheveryrulesetachievedfromnineclinicalconditionsCithroughthecalculationofOccurrenceRsbj(RCi),where1�i�9.Foreachsubject,twoclinicalconditionswithtopoccurrenceratioshavebeenselectedasnearestconditionstothesubject.Iftheclosestconditionstoasubjectinvolvethesameclinicallabelasthesubject’slabel,itshowsarichsimilaritybetweentherulesetsofthesubjectanditscorrespondingclinicalcondition.Thenumberofsubjectswiththesameclinicallabelintheirnearestconditionscanindicatethecorrectnessoftheproposedevaluation.

Table8.2showsthethreesignificantclinicalconditions,withthenumberofsubjectsineachofthem,inwhichtheirclinicallabelshavebeenrevealedasoneoftheirnearestconditions.ForthesubjectsinCHF,MI,andSepsisconditions,mostoftherulesetsweresignificantlyclosetotheirclinicalmodel.However,thebehaviourofrulesetsfortheRespiratoryfailureconditionanditssubjectsarenotadequatelysimilar.

8.2LinguisticDescriptionsforPatternsand

TemporalRules

Thissectionpresentshowasetofdata-driveninformationcanbeturnedintoasetofnaturallanguagesentencesthatarehumanlyunderstandable.Inpar-ticular,itexplainstemplate-basedapproacheswhichdirectlymaptheoutput

8.2. LINGUISTIC DESCRIPTIONS FOR PATTERNS AND TEMPORAL RULES131

Resp. failure Bleed CHF Brain injury Sepsis MI Angina Valve CABG0%

20%

40%

60%

80%

100%

Clinical Conditions

Occure

nce R

atio

TRM−ρ3

TRM−ρ1

Figure 8.5: Boxplot diagram of the occurrence ratios between one clinical con-dition’s rule set and the other conditions with TRM-ρ3 (each row in Table 8.1),in comparison with the results of TRM-ρ1.

rule set of a subject sbj is compared with every rule set achieved from nineclinical conditions Ci through the calculation of Occurrence Rsbj

(RCi), where

1 � i � 9. For each subject, two clinical conditions with top occurrence ratioshave been selected as nearest conditions to the subject. If the closest conditionsto a subject involve the same clinical label as the subject’s label, it shows arich similarity between the rule sets of the subject and its corresponding clinicalcondition. The number of subjects with the same clinical label in their nearestconditions can indicate the correctness of the proposed evaluation.

Table 8.2 shows the three significant clinical conditions, with the number ofsubjects in each of them, in which their clinical labels have been revealed as oneof their nearest conditions. For the subjects in CHF, MI, and Sepsis conditions,most of the rule sets were significantly close to their clinical model. However,the behaviour of rule sets for the Respiratory failure condition and its subjectsare not adequately similar.

8.2 Linguistic Descriptions for Patterns and

Temporal Rules

This section presents how a set of data-driven information can be turned intoa set of natural language sentences that are humanly understandable. In par-ticular, it explains template-based approaches which directly map the output

8.2.LINGUISTICDESCRIPTIONSFORPATTERNSANDTEMPORALRULES131

Resp. failureBleedCHFBrain injurySepsisMIAnginaValveCABG0%

20%

40%

60%

80%

100%

Clinical Conditions

Occure

nce R

atio

TRM−ρ3

TRM−ρ1

Figure8.5:Boxplotdiagramoftheoccurrenceratiosbetweenoneclinicalcon-dition’srulesetandtheotherconditionswithTRM-ρ3(eachrowinTable8.1),incomparisonwiththeresultsofTRM-ρ1.

rulesetofasubjectsbjiscomparedwitheveryrulesetachievedfromnineclinicalconditionsCithroughthecalculationofOccurrenceRsbj(RCi),where1�i�9.Foreachsubject,twoclinicalconditionswithtopoccurrenceratioshavebeenselectedasnearestconditionstothesubject.Iftheclosestconditionstoasubjectinvolvethesameclinicallabelasthesubject’slabel,itshowsarichsimilaritybetweentherulesetsofthesubjectanditscorrespondingclinicalcondition.Thenumberofsubjectswiththesameclinicallabelintheirnearestconditionscanindicatethecorrectnessoftheproposedevaluation.

Table8.2showsthethreesignificantclinicalconditions,withthenumberofsubjectsineachofthem,inwhichtheirclinicallabelshavebeenrevealedasoneoftheirnearestconditions.ForthesubjectsinCHF,MI,andSepsisconditions,mostoftherulesetsweresignificantlyclosetotheirclinicalmodel.However,thebehaviourofrulesetsfortheRespiratoryfailureconditionanditssubjectsarenotadequatelysimilar.

8.2LinguisticDescriptionsforPatternsand

TemporalRules

Thissectionpresentshowasetofdata-driveninformationcanbeturnedintoasetofnaturallanguagesentencesthatarehumanlyunderstandable.Inpar-ticular,itexplainstemplate-basedapproacheswhichdirectlymaptheoutput


Table 8.2: Subjects with the same condition in their nearest rule sets.

Selected clinicalcondition

No. of SubjectsNo. of nearest

conditions withsame label

percentage (%)

CHF 13 11 85%

MI 6 5 83%

Resp. failure 10 6 60%

Sepsis 4 3 75%

numerical information to linguistic characterisations. These approaches are ap-plied to three sets of information derived from data analysis methods: partialtrends, prototypical patterns, and temporal rules of patterns. It is notable thatthe proposed approaches here are heuristic and are the first attempt to generatenatural language text for derived numerical information. It will be shown inChapter 9 that how semantic representations can be involved in the process oftext generation in order to enrich the final description of such information.

8.2.1 Trend and Pattern Description

A text generation method proposed in [29] provides a framework to detectand represent partial trends in sequential patterns. The method first detects thepartial trends of an input time series based on their numeric features such asslope and duration. Then it characterises the partial trends in a textual formusing a mapping function between numeric and symbolic terms such as suddenincrease, steady decay, much fluctuated, and so on. By employing this method,the patterns in a temporal rule can be described based on their partial trends.The benefit of using natural language generation to represent the trends is thatall the temporal events from a set of physiological time series data could besummarised in a textual output, which helps the end user to get a global per-spective of the repetitive patterns and their temporal correlations in a massiveamount of measurements.

The linguistic description approach applied for trend characterisation is in-spired by the NLG architecture proposed by Reiter and Dale [185]. As pre-sented in Chapter 2, this architecture includes the steps of data interpretation,document planning, microplanning and realisation (See Section 2.3.2 for moredetails). This section describes the linguistic characterisation of the detectedtrends only in the data analysis phase which is a part of microplanning mod-ule. For other tasks in NLG system, the framework follows the developed NLGmethods in [185] or more recently in [119].

While extracting partial trends from time series data to represent them innatural language, the orientation of detected trends is interpreted in linguistic


Table8.2:Subjectswiththesameconditionintheirnearestrulesets.

Selectedclinicalcondition

No.ofSubjectsNo.ofnearest

conditionswithsamelabel

percentage(%)

CHF131185%

MI6583%

Resp.failure10660%

Sepsis4375%

numericalinformationtolinguisticcharacterisations.Theseapproachesareap-pliedtothreesetsofinformationderivedfromdataanalysismethods:partialtrends,prototypicalpatterns,andtemporalrulesofpatterns.Itisnotablethattheproposedapproacheshereareheuristicandarethefirstattempttogeneratenaturallanguagetextforderivednumericalinformation.ItwillbeshowninChapter9thathowsemanticrepresentationscanbeinvolvedintheprocessoftextgenerationinordertoenrichthefinaldescriptionofsuchinformation.

8.2.1TrendandPatternDescription

Atextgenerationmethodproposedin[29]providesaframeworktodetectandrepresentpartialtrendsinsequentialpatterns.Themethodfirstdetectsthepartialtrendsofaninputtimeseriesbasedontheirnumericfeaturessuchasslopeandduration.Thenitcharacterisesthepartialtrendsinatextualformusingamappingfunctionbetweennumericandsymbolictermssuchassuddenincrease,steadydecay,muchfluctuated,andsoon.Byemployingthismethod,thepatternsinatemporalrulecanbedescribedbasedontheirpartialtrends.Thebenefitofusingnaturallanguagegenerationtorepresentthetrendsisthatallthetemporaleventsfromasetofphysiologicaltimeseriesdatacouldbesummarisedinatextualoutput,whichhelpstheendusertogetaglobalper-spectiveoftherepetitivepatternsandtheirtemporalcorrelationsinamassiveamountofmeasurements.

Thelinguisticdescriptionapproachappliedfortrendcharacterisationisin-spiredbytheNLGarchitectureproposedbyReiterandDale[185].Aspre-sentedinChapter2,thisarchitectureincludesthestepsofdatainterpretation,documentplanning,microplanningandrealisation(SeeSection2.3.2formoredetails).Thissectiondescribesthelinguisticcharacterisationofthedetectedtrendsonlyinthedataanalysisphasewhichisapartofmicroplanningmod-ule.ForothertasksinNLGsystem,theframeworkfollowsthedevelopedNLGmethodsin[185]ormorerecentlyin[119].

Whileextractingpartialtrendsfromtimeseriesdatatorepresenttheminnaturallanguage,theorientationofdetectedtrendsisinterpretedinlinguistic


Table 8.2: Subjects with the same condition in their nearest rule sets.

Selected clinicalcondition

No. of SubjectsNo. of nearest

conditions withsame label

percentage (%)

CHF 13 11 85%

MI 6 5 83%

Resp. failure 10 6 60%

Sepsis 4 3 75%

numerical information to linguistic characterisations. These approaches are ap-plied to three sets of information derived from data analysis methods: partialtrends, prototypical patterns, and temporal rules of patterns. It is notable thatthe proposed approaches here are heuristic and are the first attempt to generatenatural language text for derived numerical information. It will be shown inChapter 9 that how semantic representations can be involved in the process oftext generation in order to enrich the final description of such information.

8.2.1 Trend and Pattern Description

A text generation method proposed in [29] provides a framework to detectand represent partial trends in sequential patterns. The method first detects thepartial trends of an input time series based on their numeric features such asslope and duration. Then it characterises the partial trends in a textual formusing a mapping function between numeric and symbolic terms such as suddenincrease, steady decay, much fluctuated, and so on. By employing this method,the patterns in a temporal rule can be described based on their partial trends.The benefit of using natural language generation to represent the trends is thatall the temporal events from a set of physiological time series data could besummarised in a textual output, which helps the end user to get a global per-spective of the repetitive patterns and their temporal correlations in a massiveamount of measurements.

The linguistic description approach applied for trend characterisation is in-spired by the NLG architecture proposed by Reiter and Dale [185]. As pre-sented in Chapter 2, this architecture includes the steps of data interpretation,document planning, microplanning and realisation (See Section 2.3.2 for moredetails). This section describes the linguistic characterisation of the detectedtrends only in the data analysis phase which is a part of microplanning mod-ule. For other tasks in NLG system, the framework follows the developed NLGmethods in [185] or more recently in [119].

While extracting partial trends from time series data to represent them innatural language, the orientation of detected trends is interpreted in linguistic


Table8.2:Subjectswiththesameconditionintheirnearestrulesets.

Selectedclinicalcondition

No.ofSubjectsNo.ofnearest

conditionswithsamelabel

percentage(%)

CHF131185%

MI6583%

Resp.failure10660%

Sepsis4375%

numericalinformationtolinguisticcharacterisations.Theseapproachesareap-pliedtothreesetsofinformationderivedfromdataanalysismethods:partialtrends,prototypicalpatterns,andtemporalrulesofpatterns.Itisnotablethattheproposedapproacheshereareheuristicandarethefirstattempttogeneratenaturallanguagetextforderivednumericalinformation.ItwillbeshowninChapter9thathowsemanticrepresentationscanbeinvolvedintheprocessoftextgenerationinordertoenrichthefinaldescriptionofsuchinformation.

8.2.1TrendandPatternDescription

Atextgenerationmethodproposedin[29]providesaframeworktodetectandrepresentpartialtrendsinsequentialpatterns.Themethodfirstdetectsthepartialtrendsofaninputtimeseriesbasedontheirnumericfeaturessuchasslopeandduration.Thenitcharacterisesthepartialtrendsinatextualformusingamappingfunctionbetweennumericandsymbolictermssuchassuddenincrease,steadydecay,muchfluctuated,andsoon.Byemployingthismethod,thepatternsinatemporalrulecanbedescribedbasedontheirpartialtrends.Thebenefitofusingnaturallanguagegenerationtorepresentthetrendsisthatallthetemporaleventsfromasetofphysiologicaltimeseriesdatacouldbesummarisedinatextualoutput,whichhelpstheendusertogetaglobalper-spectiveoftherepetitivepatternsandtheirtemporalcorrelationsinamassiveamountofmeasurements.

Thelinguisticdescriptionapproachappliedfortrendcharacterisationisin-spiredbytheNLGarchitectureproposedbyReiterandDale[185].Aspre-sentedinChapter2,thisarchitectureincludesthestepsofdatainterpretation,documentplanning,microplanningandrealisation(SeeSection2.3.2formoredetails).Thissectiondescribesthelinguisticcharacterisationofthedetectedtrendsonlyinthedataanalysisphasewhichisapartofmicroplanningmod-ule.ForothertasksinNLGsystem,theframeworkfollowsthedevelopedNLGmethodsin[185]ormorerecentlyin[119].

Whileextractingpartialtrendsfromtimeseriesdatatorepresenttheminnaturallanguage,theorientationofdetectedtrendsisinterpretedinlinguistic

8.2. LINGUISTIC DESCRIPTIONS 133

Table 8.3: The instances of linguistic terms used for describing trends.

��RangeDuration

Short Medium Long

Small steadily,nor-mally

slowly

Medium gradually Adverb

Big suddenly sharply regularly

rise,drop

increase,decrease,recover

- Verb

terms. For this reason, two following features of each trend are considered:(1) the duration of trend and (2) the range of values that trend belongs to. Tomeet the requirements of the end user and domain specificity, the system usesa fuzzy granulation. A heuristic method is used to map between the mentionedfeatures and the linguistic terms considering the following behaviours of trends:the duration of the trend to be represented (short, medium, long), and the rangeof trend would be represented (small, medium, big). Note that depending onsome criteria like the goal of the system, the end user’s needs and the type ofinput health parameter, the function of identifying these terms may vary.

With this categorisation, the system can fetch the linguistic terms to describeeach trend (with specified duration and range) in natural language sentences.These sentences include particular portions such as subject, verb, adverb etc.which have to be clarified by the system. An example of defined lexicons for thetrend’s behaviour is illustrated in Table 8.3 which includes a set of suggestionsfor the proper verbs and adverbs in each combination of specified duration andrange. Figure 8.6 shows some instances of linguistic terms for extracted trendsin HR (top) and RR (down) signals.

8.2.2 Temporal Rule Representation

One descriptive way of representing the rules is to generate a textual repre-sentation of them for the end user of the system. A simple representation ofa typical rule, r : A ⇒ B in natural language text is to put the relation andthe definition of the itemsets as the antecedent and consequent in a textual for-mat such as: “When (If/while) A occurs (happens), then (after that, at the sametime) B will occur”. For instance, in the example of market basket [207], a rulecould be explained as: “Customers who buy bread and cheese are likely to buymilk.” The main challenge in the textual representation of the temporal rulesr : A

ρ=⇒ B is to involve the temporal relation (ρ) into the rule representation.

8.2.LINGUISTICDESCRIPTIONS133

Table8.3:Theinstancesoflinguistictermsusedfordescribingtrends.

�� RangeDuration

ShortMediumLong

Smallsteadily,nor-mally

slowly

MediumgraduallyAdverb

Bigsuddenlysharplyregularly

rise,drop


-Verb

terms.Forthisreason,twofollowingfeaturesofeachtrendareconsidered:(1)thedurationoftrendand(2)therangeofvaluesthattrendbelongsto.Tomeettherequirementsoftheenduseranddomainspecificity,thesystemusesafuzzygranulation.Aheuristicmethodisusedtomapbetweenthementionedfeaturesandthelinguistictermsconsideringthefollowingbehavioursoftrends:thedurationofthetrendtoberepresented(short,medium,long),andtherangeoftrendwouldberepresented(small,medium,big).Notethatdependingonsomecriterialikethegoalofthesystem,theenduser’sneedsandthetypeofinputhealthparameter,thefunctionofidentifyingthesetermsmayvary.

Withthiscategorisation,thesystemcanfetchthelinguistictermstodescribeeachtrend(withspecifieddurationandrange)innaturallanguagesentences.Thesesentencesincludeparticularportionssuchassubject,verb,adverbetc.whichhavetobeclarifiedbythesystem.Anexampleofdefinedlexiconsforthetrend’sbehaviourisillustratedinTable8.3whichincludesasetofsuggestionsfortheproperverbsandadverbsineachcombinationofspecifieddurationandrange.Figure8.6showssomeinstancesoflinguistictermsforextractedtrendsinHR(top)andRR(down)signals.

8.2.2TemporalRuleRepresentation

Onedescriptivewayofrepresentingtherulesistogenerateatextualrepre-sentationofthemfortheenduserofthesystem.Asimplerepresentationofatypicalrule,r:A⇒Binnaturallanguagetextistoputtherelationandthedefinitionoftheitemsetsastheantecedentandconsequentinatextualfor-matsuchas:“When(If/while)Aoccurs(happens),then(afterthat,atthesametime)Bwilloccur”.Forinstance,intheexampleofmarketbasket[207],arulecouldbeexplainedas:“Customerswhobuybreadandcheesearelikelytobuymilk.”Themainchallengeinthetextualrepresentationofthetemporalrulesr:A

ρ=⇒Bistoinvolvethetemporalrelation(ρ)intotherulerepresentation.


Table 8.3: The instances of linguistic terms used for describing trends.

��RangeDuration

Short Medium Long

Small steadily,nor-mally

slowly

Medium gradually Adverb

Big suddenly sharply regularly

rise,drop


- Verb

terms. For this reason, two following features of each trend are considered:(1) the duration of trend and (2) the range of values that trend belongs to. Tomeet the requirements of the end user and domain specificity, the system usesa fuzzy granulation. A heuristic method is used to map between the mentionedfeatures and the linguistic terms considering the following behaviours of trends:the duration of the trend to be represented (short, medium, long), and the rangeof trend would be represented (small, medium, big). Note that depending onsome criteria like the goal of the system, the end user’s needs and the type ofinput health parameter, the function of identifying these terms may vary.

With this categorisation, the system can fetch the linguistic terms to describeeach trend (with specified duration and range) in natural language sentences.These sentences include particular portions such as subject, verb, adverb etc.which have to be clarified by the system. An example of defined lexicons for thetrend’s behaviour is illustrated in Table 8.3 which includes a set of suggestionsfor the proper verbs and adverbs in each combination of specified duration andrange. Figure 8.6 shows some instances of linguistic terms for extracted trendsin HR (top) and RR (down) signals.

8.2.2 Temporal Rule Representation

One descriptive way of representing the rules is to generate a textual repre-sentation of them for the end user of the system. A simple representation ofa typical rule, r : A ⇒ B in natural language text is to put the relation andthe definition of the itemsets as the antecedent and consequent in a textual for-mat such as: “When (If/while) A occurs (happens), then (after that, at the sametime) B will occur”. For instance, in the example of market basket [207], a rulecould be explained as: “Customers who buy bread and cheese are likely to buymilk.” The main challenge in the textual representation of the temporal rulesr : A

ρ=⇒ B is to involve the temporal relation (ρ) into the rule representation.


Table8.3:Theinstancesoflinguistictermsusedfordescribingtrends.

�� RangeDuration

ShortMediumLong

Smallsteadily,nor-mally

slowly

MediumgraduallyAdverb

Bigsuddenlysharplyregularly

rise,drop


-Verb

terms.Forthisreason,twofollowingfeaturesofeachtrendareconsidered:(1)thedurationoftrendand(2)therangeofvaluesthattrendbelongsto.Tomeettherequirementsoftheenduseranddomainspecificity,thesystemusesafuzzygranulation.Aheuristicmethodisusedtomapbetweenthementionedfeaturesandthelinguistictermsconsideringthefollowingbehavioursoftrends:thedurationofthetrendtoberepresented(short,medium,long),andtherangeoftrendwouldberepresented(small,medium,big).Notethatdependingonsomecriterialikethegoalofthesystem,theenduser’sneedsandthetypeofinputhealthparameter,thefunctionofidentifyingthesetermsmayvary.

Withthiscategorisation,thesystemcanfetchthelinguistictermstodescribeeachtrend(withspecifieddurationandrange)innaturallanguagesentences.Thesesentencesincludeparticularportionssuchassubject,verb,adverbetc.whichhavetobeclarifiedbythesystem.Anexampleofdefinedlexiconsforthetrend’sbehaviourisillustratedinTable8.3whichincludesasetofsuggestionsfortheproperverbsandadverbsineachcombinationofspecifieddurationandrange.Figure8.6showssomeinstancesoflinguistictermsforextractedtrendsinHR(top)andRR(down)signals.

8.2.2TemporalRuleRepresentation

Onedescriptivewayofrepresentingtherulesistogenerateatextualrepre-sentationofthemfortheenduserofthesystem.Asimplerepresentationofatypicalrule,r:A⇒Binnaturallanguagetextistoputtherelationandthedefinitionoftheitemsetsastheantecedentandconsequentinatextualfor-matsuchas:“When(If/while)Aoccurs(happens),then(afterthat,atthesametime)Bwilloccur”.Forinstance,intheexampleofmarketbasket[207],arulecouldbeexplainedas:“Customerswhobuybreadandcheesearelikelytobuymilk.”Themainchallengeinthetextualrepresentationofthetemporalrulesr:A

ρ=⇒Bistoinvolvethetemporalrelation(ρ)intotherulerepresentation.


Figure 8.6: The output of the partial trends for two segmented time series (HRon top and RR on bottom), shown in Figure 7.5.

As mentioned before, for two subsequences of patterns P1 and P2 with the re-lation ρ, both temporal rules P1

ρ=⇒ P2 and P2

ρ=⇒ P1 can be generated through

the association rule mining method. Although the temporal relation is same forthese two rules, the meaning and interpretation of them are practically differ-ent, because the roles of antecedent and consequent have been swapped. Forthis reason, a linguistic mapping from temporal rules to their messages shouldbe defined. Table 8.4 illustrates a mapping for these two rules with the itemsetsP1 and P2 while considering the defined temporal relations ρ ∈ {‘equal’, ‘before’,‘after’}.

Another challenge of textual rule representation while dealing with pat-terns is how to explain the antecedent and consequent patterns as temporalsubsequences in a meaningful way. Since a data-driven approach extracts theprototypical patterns, there is no predefined characterisation of them by theexpert. Therefore, a linguistic description of the prototypical patterns of thesubsequences P1 and P2 should be generated in the place-holders denoted by[P1] and [P2] in Table 8.4. For instance, an output text like “After a gradualdecrease in pattern P1, then pattern P2 has a big rise and then a sharp drop”is understandable, to interpret the behaviour of patterns in discovered rules forthe end user.


Figure8.6:Theoutputofthepartialtrendsfortwosegmentedtimeseries(HRontopandRRonbottom),showninFigure7.5.

Asmentionedbefore,fortwosubsequencesofpatternsP1andP2withthere-lationρ,bothtemporalrulesP1

ρ=⇒P2andP2

ρ=⇒P1canbegeneratedthrough

theassociationruleminingmethod.Althoughthetemporalrelationissameforthesetworules,themeaningandinterpretationofthemarepracticallydiffer-ent,becausetherolesofantecedentandconsequenthavebeenswapped.Forthisreason,alinguisticmappingfromtemporalrulestotheirmessagesshouldbedefined.Table8.4illustratesamappingforthesetworuleswiththeitemsetsP1andP2whileconsideringthedefinedtemporalrelationsρ∈{‘equal’,‘before’,‘after’}.

Anotherchallengeoftextualrulerepresentationwhiledealingwithpat-ternsishowtoexplaintheantecedentandconsequentpatternsastemporalsubsequencesinameaningfulway.Sinceadata-drivenapproachextractstheprototypicalpatterns,thereisnopredefinedcharacterisationofthembytheexpert.Therefore,alinguisticdescriptionoftheprototypicalpatternsofthesubsequencesP1andP2shouldbegeneratedintheplace-holdersdenotedby[P1]and[P2]inTable8.4.Forinstance,anoutputtextlike“AfteragradualdecreaseinpatternP1,thenpatternP2hasabigriseandthenasharpdrop”isunderstandable,tointerpretthebehaviourofpatternsindiscoveredrulesfortheenduser.


Figure 8.6: The output of the partial trends for two segmented time series (HRon top and RR on bottom), shown in Figure 7.5.

As mentioned before, for two subsequences of patterns P1 and P2 with the re-lation ρ, both temporal rules P1

ρ=⇒ P2 and P2

ρ=⇒ P1 can be generated through

the association rule mining method. Although the temporal relation is same forthese two rules, the meaning and interpretation of them are practically differ-ent, because the roles of antecedent and consequent have been swapped. Forthis reason, a linguistic mapping from temporal rules to their messages shouldbe defined. Table 8.4 illustrates a mapping for these two rules with the itemsetsP1 and P2 while considering the defined temporal relations ρ ∈ {‘equal’, ‘before’,‘after’}.

Another challenge of textual rule representation while dealing with pat-terns is how to explain the antecedent and consequent patterns as temporalsubsequences in a meaningful way. Since a data-driven approach extracts theprototypical patterns, there is no predefined characterisation of them by theexpert. Therefore, a linguistic description of the prototypical patterns of thesubsequences P1 and P2 should be generated in the place-holders denoted by[P1] and [P2] in Table 8.4. For instance, an output text like “After a gradualdecrease in pattern P1, then pattern P2 has a big rise and then a sharp drop”is understandable, to interpret the behaviour of patterns in discovered rules forthe end user.


Figure8.6:Theoutputofthepartialtrendsfortwosegmentedtimeseries(HRontopandRRonbottom),showninFigure7.5.

Asmentionedbefore,fortwosubsequencesofpatternsP1andP2withthere-lationρ,bothtemporalrulesP1

ρ=⇒P2andP2

ρ=⇒P1canbegeneratedthrough

theassociationruleminingmethod.Althoughthetemporalrelationissameforthesetworules,themeaningandinterpretationofthemarepracticallydiffer-ent,becausetherolesofantecedentandconsequenthavebeenswapped.Forthisreason,alinguisticmappingfromtemporalrulestotheirmessagesshouldbedefined.Table8.4illustratesamappingforthesetworuleswiththeitemsetsP1andP2whileconsideringthedefinedtemporalrelationsρ∈{‘equal’,‘before’,‘after’}.

Anotherchallengeoftextualrulerepresentationwhiledealingwithpat-ternsishowtoexplaintheantecedentandconsequentpatternsastemporalsubsequencesinameaningfulway.Sinceadata-drivenapproachextractstheprototypicalpatterns,thereisnopredefinedcharacterisationofthembytheexpert.Therefore,alinguisticdescriptionoftheprototypicalpatternsofthesubsequencesP1andP2shouldbegeneratedintheplace-holdersdenotedby[P1]and[P2]inTable8.4.Forinstance,anoutputtextlike“AfteragradualdecreaseinpatternP1,thenpatternP2hasabigriseandthenasharpdrop”isunderstandable,tointerpretthebehaviourofpatternsindiscoveredrulesfortheenduser.


Table 8.4: A template-based textual representation of rules with temporal rela-tions.

P1ρ=⇒ P2 P2

ρ=⇒ P1

P1equalsP2when [P1], at the same time

[P2].when [P2], at the same time

[P1].

P1beforeP2 when [P1], after that [P2].when [P2], before that [P1].

P1afterP2 when [P1], before that [P2]. when [P2], after that [P1].

Textual Representation of Temporal Rules

As described in Section 8.2.2, the significant tasks in temporal rule representa-tion are 1) characterising the main trends in each of antecedent and consequentas patterns and 2) realising the form of temporal relation in a provided rule(Table 8.4). The partial trends in the patterns of a temporal rule are textuallyrepresented based on their numeric features and conduct, which is proposedin Section 7.2. The linguistic demonstrations of the temporal relation betweenthe antecedent (here, HR) and consequent (here, BP and RR) of a rule are pro-vided by a variety of words which are employed from expert knowledge. Forinstance, the equal relation is presented with the terms “at the same time, si-multaneously, concurrently, etc.” or the relations before and after are shownwith “before that, earlier, just after that, later, afterwards, etc.”. Moreover, thestrength of a temporal rule based on its support and confidence values can bealso represented in the corresponding sentence. It provides a meaningful im-pression on the rule strength for the reader of the textual messages. Variousterms and phrases for the values of support and confidence can be used. As anexample the sentence of a temporal rule with a high confidence value is startedwith the terms like: “most of the time” or “commonly”. In this work, sincethe rules are generated to show the sequential happenings in the entire data,the general conditional (if-then) sentence is implemented to characterise the an-tecedent and consequent of rules. It is worth to note that to make the final textmore natural, different templates of conditional sentences have been applied(e.g. using “when” or “after”, instead of “if”).

Table 8.5 shows the generated textual representation of the acquired tem-poral rules in Figure 8.4. In these output examples, each sentence describes adiscovered temporal rule to specify the temporal relation inside the rule withthe partial trends in each of the appeared prototypical patterns, followed bythe corresponding clinical condition. An advantage of generating final outputin natural language is a textual description that is understandable and inter-pretable by the end user of the system.


Table8.4:Atemplate-basedtextualrepresentationofruleswithtemporalrela-tions.

P1ρ=⇒P2P2

ρ=⇒P1

P1equalsP2when[P1],atthesametime

[P2].when[P2],atthesametime

[P1].

P1beforeP2when[P1],afterthat[P2].when[P2],beforethat[P1].

P1afterP2when[P1],beforethat[P2].when[P2],afterthat[P1].

TextualRepresentationofTemporalRules

AsdescribedinSection8.2.2,thesignificanttasksintemporalrulerepresenta-tionare1)characterisingthemaintrendsineachofantecedentandconsequentaspatternsand2)realisingtheformoftemporalrelationinaprovidedrule(Table8.4).Thepartialtrendsinthepatternsofatemporalrulearetextuallyrepresentedbasedontheirnumericfeaturesandconduct,whichisproposedinSection7.2.Thelinguisticdemonstrationsofthetemporalrelationbetweentheantecedent(here,HR)andconsequent(here,BPandRR)ofarulearepro-videdbyavarietyofwordswhichareemployedfromexpertknowledge.Forinstance,theequalrelationispresentedwiththeterms“atthesametime,si-multaneously,concurrently,etc.”ortherelationsbeforeandafterareshownwith“beforethat,earlier,justafterthat,later,afterwards,etc.”.Moreover,thestrengthofatemporalrulebasedonitssupportandconfidencevaluescanbealsorepresentedinthecorrespondingsentence.Itprovidesameaningfulim-pressionontherulestrengthforthereaderofthetextualmessages.Varioustermsandphrasesforthevaluesofsupportandconfidencecanbeused.Asanexamplethesentenceofatemporalrulewithahighconfidencevalueisstartedwiththetermslike:“mostofthetime”or“commonly”.Inthiswork,sincetherulesaregeneratedtoshowthesequentialhappeningsintheentiredata,thegeneralconditional(if-then)sentenceisimplementedtocharacterisethean-tecedentandconsequentofrules.Itisworthtonotethattomakethefinaltextmorenatural,differenttemplatesofconditionalsentenceshavebeenapplied(e.g.using“when”or“after”,insteadof“if”).

Table8.5showsthegeneratedtextualrepresentationoftheacquiredtem-poralrulesinFigure8.4.Intheseoutputexamples,eachsentencedescribesadiscoveredtemporalruletospecifythetemporalrelationinsidetherulewiththepartialtrendsineachoftheappearedprototypicalpatterns,followedbythecorrespondingclinicalcondition.Anadvantageofgeneratingfinaloutputinnaturallanguageisatextualdescriptionthatisunderstandableandinter-pretablebytheenduserofthesystem.


Table 8.4: A template-based textual representation of rules with temporal rela-tions.

P1ρ=⇒ P2 P2

ρ=⇒ P1

P1equalsP2when [P1], at the same time

[P2].when [P2], at the same time

[P1].

P1beforeP2 when [P1], after that [P2].when [P2], before that [P1].

P1afterP2 when [P1], before that [P2]. when [P2], after that [P1].

Textual Representation of Temporal Rules

As described in Section 8.2.2, the significant tasks in temporal rule representa-tion are 1) characterising the main trends in each of antecedent and consequentas patterns and 2) realising the form of temporal relation in a provided rule(Table 8.4). The partial trends in the patterns of a temporal rule are textuallyrepresented based on their numeric features and conduct, which is proposedin Section 7.2. The linguistic demonstrations of the temporal relation betweenthe antecedent (here, HR) and consequent (here, BP and RR) of a rule are pro-vided by a variety of words which are employed from expert knowledge. Forinstance, the equal relation is presented with the terms “at the same time, si-multaneously, concurrently, etc.” or the relations before and after are shownwith “before that, earlier, just after that, later, afterwards, etc.”. Moreover, thestrength of a temporal rule based on its support and confidence values can bealso represented in the corresponding sentence. It provides a meaningful im-pression on the rule strength for the reader of the textual messages. Variousterms and phrases for the values of support and confidence can be used. As anexample the sentence of a temporal rule with a high confidence value is startedwith the terms like: “most of the time” or “commonly”. In this work, sincethe rules are generated to show the sequential happenings in the entire data,the general conditional (if-then) sentence is implemented to characterise the an-tecedent and consequent of rules. It is worth to note that to make the final textmore natural, different templates of conditional sentences have been applied(e.g. using “when” or “after”, instead of “if”).

Table 8.5 shows the generated textual representation of the acquired tem-poral rules in Figure 8.4. In these output examples, each sentence describes adiscovered temporal rule to specify the temporal relation inside the rule withthe partial trends in each of the appeared prototypical patterns, followed bythe corresponding clinical condition. An advantage of generating final outputin natural language is a textual description that is understandable and inter-pretable by the end user of the system.


Table8.4:Atemplate-basedtextualrepresentationofruleswithtemporalrela-tions.

P1ρ=⇒P2P2

ρ=⇒P1

P1equalsP2when[P1],atthesametime

[P2].when[P2],atthesametime

[P1].

P1beforeP2when[P1],afterthat[P2].when[P2],beforethat[P1].

P1afterP2when[P1],beforethat[P2].when[P2],afterthat[P1].

TextualRepresentationofTemporalRules

AsdescribedinSection8.2.2,thesignificanttasksintemporalrulerepresenta-tionare1)characterisingthemaintrendsineachofantecedentandconsequentaspatternsand2)realisingtheformoftemporalrelationinaprovidedrule(Table8.4).Thepartialtrendsinthepatternsofatemporalrulearetextuallyrepresentedbasedontheirnumericfeaturesandconduct,whichisproposedinSection7.2.Thelinguisticdemonstrationsofthetemporalrelationbetweentheantecedent(here,HR)andconsequent(here,BPandRR)ofarulearepro-videdbyavarietyofwordswhichareemployedfromexpertknowledge.Forinstance,theequalrelationispresentedwiththeterms“atthesametime,si-multaneously,concurrently,etc.”ortherelationsbeforeandafterareshownwith“beforethat,earlier,justafterthat,later,afterwards,etc.”.Moreover,thestrengthofatemporalrulebasedonitssupportandconfidencevaluescanbealsorepresentedinthecorrespondingsentence.Itprovidesameaningfulim-pressionontherulestrengthforthereaderofthetextualmessages.Varioustermsandphrasesforthevaluesofsupportandconfidencecanbeused.Asanexamplethesentenceofatemporalrulewithahighconfidencevalueisstartedwiththetermslike:“mostofthetime”or“commonly”.Inthiswork,sincetherulesaregeneratedtoshowthesequentialhappeningsintheentiredata,thegeneralconditional(if-then)sentenceisimplementedtocharacterisethean-tecedentandconsequentofrules.Itisworthtonotethattomakethefinaltextmorenatural,differenttemplatesofconditionalsentenceshavebeenapplied(e.g.using“when”or“after”,insteadof“if”).

Table8.5showsthegeneratedtextualrepresentationoftheacquiredtem-poralrulesinFigure8.4.Intheseoutputexamples,eachsentencedescribesadiscoveredtemporalruletospecifythetemporalrelationinsidetherulewiththepartialtrendsineachoftheappearedprototypicalpatterns,followedbythecorrespondingclinicalcondition.Anadvantageofgeneratingfinaloutputinnaturallanguageisatextualdescriptionthatisunderstandableandinter-pretablebytheenduserofthesystem.


Table 8.5: Textual representation of the acquired rules in Figure 8.4.

Rules Linguistic Description

Fig. 8.4 (a)In the Bleed condition, occasionally when the heart ratenormally rises (7 beats) and steadily decreases (4 beats), atthe same time the blood pressure normally rises (10 units).

Fig. 8.4 (b)In the MI condition, usually when the respiration rate de-cays and then rises in a very small range, simultaneouslythe heart rate decays and rises very slowly.

Fig. 8.4 (c)In the Angina condition, most frequently if the heart ratesharply decreases (10 beats) and suddenly rises (13 beats),later, the blood pressure reduces very slowly.

Fig. 8.4 (d)

In the Post-op Valve condition, most of the time the respira-tion rate sharply increases (7 breaths) and steadily reduces(3 breaths, just before that, the heart rate decreases in avery small range.

Fig. 8.4 (e)In the MI condition, usually after the heart rate steadily in-creases (5 beats) and normally reduces (2 beats), the bloodpressure fluctuates in a very small range.

Fig. 8.4 (f)In the Sepsis condition, most of the time before the respira-tion rate normally rises (2 breaths) and suddenly decreases(5 breaths), the heart rate steadily decreases (6 breaths).


The approach introduced in the first section of this chapter presents a descrip-tive model of temporal rule mining to generate meaningful rules for physiologi-cal sensor data in a clinical setting. This modelling also underlies the uniquenessof the rule sets for considering cases, which means each provided rule set con-tains distinct rules that are unique to their model. The advantage of providingdistinctive rules for clinical conditions is to enable physicians to discover spe-cific behaviours of vital signs, which are not necessarily recorded in medicalontologies. The proposed approach is able to exploit unseen and distinctive in-formation per patient or condition. This information can assist the clinicians inindividual decision making.

The second section of this chapter introduces the approaches to describethe mind information linguistically. First, it is shown how linguistic terms canannotate the partial trends and prototypical patterns (extracted in data analysisphase in Chapter 7). Then, with the use of the annotated trends and patterns, a


Table8.5:TextualrepresentationoftheacquiredrulesinFigure8.4.

RulesLinguisticDescription

Fig.8.4(a)IntheBleedcondition,occasionallywhentheheartratenormallyrises(7beats)andsteadilydecreases(4beats),atthesametimethebloodpressurenormallyrises(10units).

Fig.8.4(b)IntheMIcondition,usuallywhentherespirationratede-caysandthenrisesinaverysmallrange,simultaneouslytheheartratedecaysandrisesveryslowly.

Fig.8.4(c)IntheAnginacondition,mostfrequentlyiftheheartratesharplydecreases(10beats)andsuddenlyrises(13beats),later,thebloodpressurereducesveryslowly.

Fig.8.4(d)

InthePost-opValvecondition,mostofthetimetherespira-tionratesharplyincreases(7breaths)andsteadilyreduces(3breaths,justbeforethat,theheartratedecreasesinaverysmallrange.

Fig.8.4(e)IntheMIcondition,usuallyaftertheheartratesteadilyin-creases(5beats)andnormallyreduces(2beats),thebloodpressurefluctuatesinaverysmallrange.

Fig.8.4(f)IntheSepsiscondition,mostofthetimebeforetherespira-tionratenormallyrises(2breaths)andsuddenlydecreases(5breaths),theheartratesteadilydecreases(6breaths).


Theapproachintroducedinthefirstsectionofthischapterpresentsadescrip-tivemodeloftemporalruleminingtogeneratemeaningfulrulesforphysiologi-calsensordatainaclinicalsetting.Thismodellingalsounderliestheuniquenessoftherulesetsforconsideringcases,whichmeanseachprovidedrulesetcon-tainsdistinctrulesthatareuniquetotheirmodel.Theadvantageofprovidingdistinctiverulesforclinicalconditionsistoenablephysicianstodiscoverspe-cificbehavioursofvitalsigns,whicharenotnecessarilyrecordedinmedicalontologies.Theproposedapproachisabletoexploitunseenanddistinctivein-formationperpatientorcondition.Thisinformationcanassistthecliniciansinindividualdecisionmaking.

Thesecondsectionofthischapterintroducestheapproachestodescribethemindinformationlinguistically.First,itisshownhowlinguistictermscanannotatethepartialtrendsandprototypicalpatterns(extractedindataanalysisphaseinChapter7).Then,withtheuseoftheannotatedtrendsandpatterns,a


Table 8.5: Textual representation of the acquired rules in Figure 8.4.

Rules Linguistic Description

Fig. 8.4 (a)In the Bleed condition, occasionally when the heart ratenormally rises (7 beats) and steadily decreases (4 beats), atthe same time the blood pressure normally rises (10 units).

Fig. 8.4 (b)In the MI condition, usually when the respiration rate de-cays and then rises in a very small range, simultaneouslythe heart rate decays and rises very slowly.

Fig. 8.4 (c)In the Angina condition, most frequently if the heart ratesharply decreases (10 beats) and suddenly rises (13 beats),later, the blood pressure reduces very slowly.

Fig. 8.4 (d)

In the Post-op Valve condition, most of the time the respira-tion rate sharply increases (7 breaths) and steadily reduces(3 breaths, just before that, the heart rate decreases in avery small range.

Fig. 8.4 (e)In the MI condition, usually after the heart rate steadily in-creases (5 beats) and normally reduces (2 beats), the bloodpressure fluctuates in a very small range.

Fig. 8.4 (f)In the Sepsis condition, most of the time before the respira-tion rate normally rises (2 breaths) and suddenly decreases(5 breaths), the heart rate steadily decreases (6 breaths).


The approach introduced in the first section of this chapter presents a descrip-tive model of temporal rule mining to generate meaningful rules for physiologi-cal sensor data in a clinical setting. This modelling also underlies the uniquenessof the rule sets for considering cases, which means each provided rule set con-tains distinct rules that are unique to their model. The advantage of providingdistinctive rules for clinical conditions is to enable physicians to discover spe-cific behaviours of vital signs, which are not necessarily recorded in medicalontologies. The proposed approach is able to exploit unseen and distinctive in-formation per patient or condition. This information can assist the clinicians inindividual decision making.

The second section of this chapter introduces the approaches to describethe mind information linguistically. First, it is shown how linguistic terms canannotate the partial trends and prototypical patterns (extracted in data analysisphase in Chapter 7). Then, with the use of the annotated trends and patterns, a


Table8.5:TextualrepresentationoftheacquiredrulesinFigure8.4.

RulesLinguisticDescription

Fig.8.4(a)IntheBleedcondition,occasionallywhentheheartratenormallyrises(7beats)andsteadilydecreases(4beats),atthesametimethebloodpressurenormallyrises(10units).

Fig.8.4(b)IntheMIcondition,usuallywhentherespirationratede-caysandthenrisesinaverysmallrange,simultaneouslytheheartratedecaysandrisesveryslowly.

Fig.8.4(c)IntheAnginacondition,mostfrequentlyiftheheartratesharplydecreases(10beats)andsuddenlyrises(13beats),later,thebloodpressurereducesveryslowly.

Fig.8.4(d)

InthePost-opValvecondition,mostofthetimetherespira-tionratesharplyincreases(7breaths)andsteadilyreduces(3breaths,justbeforethat,theheartratedecreasesinaverysmallrange.

Fig.8.4(e)IntheMIcondition,usuallyaftertheheartratesteadilyin-creases(5beats)andnormallyreduces(2beats),thebloodpressurefluctuatesinaverysmallrange.

Fig.8.4(f)IntheSepsiscondition,mostofthetimebeforetherespira-tionratenormallyrises(2breaths)andsuddenlydecreases(5breaths),theheartratesteadilydecreases(6breaths).


Theapproachintroducedinthefirstsectionofthischapterpresentsadescrip-tivemodeloftemporalruleminingtogeneratemeaningfulrulesforphysiologi-calsensordatainaclinicalsetting.Thismodellingalsounderliestheuniquenessoftherulesetsforconsideringcases,whichmeanseachprovidedrulesetcon-tainsdistinctrulesthatareuniquetotheirmodel.Theadvantageofprovidingdistinctiverulesforclinicalconditionsistoenablephysicianstodiscoverspe-cificbehavioursofvitalsigns,whicharenotnecessarilyrecordedinmedicalontologies.Theproposedapproachisabletoexploitunseenanddistinctivein-formationperpatientorcondition.Thisinformationcanassistthecliniciansinindividualdecisionmaking.

Thesecondsectionofthischapterintroducestheapproachestodescribethemindinformationlinguistically.First,itisshownhowlinguistictermscanannotatethepartialtrendsandprototypicalpatterns(extractedindataanalysisphaseinChapter7).Then,withtheuseoftheannotatedtrendsandpatterns,a


template-based approach is presented to generate linguistic descriptions for themined temporal rules involving the temporal relation of the unknown but in-teresting patterns. Although the output results are reasonably human-readabletext, the limitation of this approach is the richness of the provided annotationsand labels for the partial trends and patterns. This method has been only rely-ing on the shape and the trends of the time series patterns. So, an alternativepass to describe unknown patterns and temporal rules is to describe them by amodel that is constructed on the basis of the known annotated patterns. Seman-tic representations that are presented in Part I can play this role in a linguisticdescription method.

In sum, this chapter provides a more in-depth analysis of the extracted pat-terns of physiological data to find temporal rules among those patterns. After,it presented linguistic description approaches to turn patterns and rules intonatural language texts. Chapter 9 shows how a semantic representation canhelp the linguistic description approaches to enrich the final text, and how itcan help to describe further (and possibly unknown) observations.


template-basedapproachispresentedtogeneratelinguisticdescriptionsfortheminedtemporalrulesinvolvingthetemporalrelationoftheunknownbutin-terestingpatterns.Althoughtheoutputresultsarereasonablyhuman-readabletext,thelimitationofthisapproachistherichnessoftheprovidedannotationsandlabelsforthepartialtrendsandpatterns.Thismethodhasbeenonlyrely-ingontheshapeandthetrendsofthetimeseriespatterns.So,analternativepasstodescribeunknownpatternsandtemporalrulesistodescribethembyamodelthatisconstructedonthebasisoftheknownannotatedpatterns.Seman-ticrepresentationsthatarepresentedinPartIcanplaythisroleinalinguisticdescriptionmethod.

Insum,thischapterprovidesamorein-depthanalysisoftheextractedpat-ternsofphysiologicaldatatofindtemporalrulesamongthosepatterns.After,itpresentedlinguisticdescriptionapproachestoturnpatternsandrulesintonaturallanguagetexts.Chapter9showshowasemanticrepresentationcanhelpthelinguisticdescriptionapproachestoenrichthefinaltext,andhowitcanhelptodescribefurther(andpossiblyunknown)observations.


template-based approach is presented to generate linguistic descriptions for themined temporal rules involving the temporal relation of the unknown but in-teresting patterns. Although the output results are reasonably human-readabletext, the limitation of this approach is the richness of the provided annotationsand labels for the partial trends and patterns. This method has been only rely-ing on the shape and the trends of the time series patterns. So, an alternativepass to describe unknown patterns and temporal rules is to describe them by amodel that is constructed on the basis of the known annotated patterns. Seman-tic representations that are presented in Part I can play this role in a linguisticdescription method.

In sum, this chapter provides a more in-depth analysis of the extracted pat-terns of physiological data to find temporal rules among those patterns. After,it presented linguistic description approaches to turn patterns and rules intonatural language texts. Chapter 9 shows how a semantic representation canhelp the linguistic description approaches to enrich the final text, and how itcan help to describe further (and possibly unknown) observations.


template-basedapproachispresentedtogeneratelinguisticdescriptionsfortheminedtemporalrulesinvolvingthetemporalrelationoftheunknownbutin-terestingpatterns.Althoughtheoutputresultsarereasonablyhuman-readabletext,thelimitationofthisapproachistherichnessoftheprovidedannotationsandlabelsforthepartialtrendsandpatterns.Thismethodhasbeenonlyrely-ingontheshapeandthetrendsofthetimeseriespatterns.So,analternativepasstodescribeunknownpatternsandtemporalrulesistodescribethembyamodelthatisconstructedonthebasisoftheknownannotatedpatterns.Seman-ticrepresentationsthatarepresentedinPartIcanplaythisroleinalinguisticdescriptionmethod.

Insum,thischapterprovidesamorein-depthanalysisoftheextractedpat-ternsofphysiologicaldatatofindtemporalrulesamongthosepatterns.After,itpresentedlinguisticdescriptionapproachestoturnpatternsandrulesintonaturallanguagetexts.Chapter9showshowasemanticrepresentationcanhelpthelinguisticdescriptionapproachestoenrichthefinaltext,andhowitcanhelptodescribefurther(andpossiblyunknown)observations.

Chapter 9

Linguistic Descriptions for

Time Series Patterns using

Conceptual Spaces

“Long before worrying about how to convince others, youfirst have to understand what’s happening yourself.”

— Andrew Gelman (1965–)

This chapter presents the application of the proposed semantic representa-tions in Part I in the area of physiological sensor data. As mentioned in

Part I, the input of a semantic representation needs to be a set of perceivedinformation with certain attributes (namely involving labels and features forobservations). This thesis has employed data analysis approaches to extractmeaningful and interesting information. These approaches have already beenwidely discussed in chapters 7 and 8. The abstracted patterns from sensor dataare now used in a semantic representation, and then be utilised to infer linguis-tic descriptions, as will be presented in this chapter. The distinctive approachpresented here is the ability of modelling and interpreting new observationsthat are could be unknown (i.e., not pre-defined) for the system.

Within the field of time series data mining, the perception-based analysis ofpatterns attempts to formalise knowledge and simulate human reasoning [35].Linguistic descriptions can represent the perceptions (i.e., words such as low,increasing, most of the time, etc.) of time series patterns. A time series pat-tern is a subsequence of a univariate time series containing a meaningful be-haviour or trend of the data. Many studies consider the problem of qualitativeanalysis of time series patterns and its manipulation with linguistic informa-tion [32, 36, 124, 165, 237]. However, in most of them, the required linguistic

139

Chapter9

LinguisticDescriptionsfor

TimeSeriesPatternsusing

ConceptualSpaces

“Longbeforeworryingabouthowtoconvinceothers,youfirsthavetounderstandwhat’shappeningyourself.”

—AndrewGelman(1965–)

Thischapterpresentstheapplicationoftheproposedsemanticrepresenta-tionsinPartIintheareaofphysiologicalsensordata.Asmentionedin

PartI,theinputofasemanticrepresentationneedstobeasetofperceivedinformationwithcertainattributes(namelyinvolvinglabelsandfeaturesforobservations).Thisthesishasemployeddataanalysisapproachestoextractmeaningfulandinterestinginformation.Theseapproacheshavealreadybeenwidelydiscussedinchapters7and8.Theabstractedpatternsfromsensordataarenowusedinasemanticrepresentation,andthenbeutilisedtoinferlinguis-ticdescriptions,aswillbepresentedinthischapter.Thedistinctiveapproachpresentedhereistheabilityofmodellingandinterpretingnewobservationsthatarecouldbeunknown(i.e.,notpre-defined)forthesystem.

Withinthefieldoftimeseriesdatamining,theperception-basedanalysisofpatternsattemptstoformaliseknowledgeandsimulatehumanreasoning[35].Linguisticdescriptionscanrepresenttheperceptions(i.e.,wordssuchaslow,increasing,mostofthetime,etc.)oftimeseriespatterns.Atimeseriespat-ternisasubsequenceofaunivariatetimeseriescontainingameaningfulbe-haviourortrendofthedata.Manystudiesconsidertheproblemofqualitativeanalysisoftimeseriespatternsanditsmanipulationwithlinguisticinforma-tion[32,36,124,165,237].However,inmostofthem,therequiredlinguistic

139

Chapter 9

Linguistic Descriptions for

Time Series Patterns using

Conceptual Spaces

“Long before worrying about how to convince others, youfirst have to understand what’s happening yourself.”

— Andrew Gelman (1965–)

This chapter presents the application of the proposed semantic representa-tions in Part I in the area of physiological sensor data. As mentioned in

Part I, the input of a semantic representation needs to be a set of perceivedinformation with certain attributes (namely involving labels and features forobservations). This thesis has employed data analysis approaches to extractmeaningful and interesting information. These approaches have already beenwidely discussed in chapters 7 and 8. The abstracted patterns from sensor dataare now used in a semantic representation, and then be utilised to infer linguis-tic descriptions, as will be presented in this chapter. The distinctive approachpresented here is the ability of modelling and interpreting new observationsthat are could be unknown (i.e., not pre-defined) for the system.

Within the field of time series data mining, the perception-based analysis ofpatterns attempts to formalise knowledge and simulate human reasoning [35].Linguistic descriptions can represent the perceptions (i.e., words such as low,increasing, most of the time, etc.) of time series patterns. A time series pat-tern is a subsequence of a univariate time series containing a meaningful be-haviour or trend of the data. Many studies consider the problem of qualitativeanalysis of time series patterns and its manipulation with linguistic informa-tion [32, 36, 124, 165, 237]. However, in most of them, the required linguistic

139

Chapter9

LinguisticDescriptionsfor

TimeSeriesPatternsusing

ConceptualSpaces

“Longbeforeworryingabouthowtoconvinceothers,youfirsthavetounderstandwhat’shappeningyourself.”

—AndrewGelman(1965–)

Thischapterpresentstheapplicationoftheproposedsemanticrepresenta-tionsinPartIintheareaofphysiologicalsensordata.Asmentionedin

PartI,theinputofasemanticrepresentationneedstobeasetofperceivedinformationwithcertainattributes(namelyinvolvinglabelsandfeaturesforobservations).Thisthesishasemployeddataanalysisapproachestoextractmeaningfulandinterestinginformation.Theseapproacheshavealreadybeenwidelydiscussedinchapters7and8.Theabstractedpatternsfromsensordataarenowusedinasemanticrepresentation,andthenbeutilisedtoinferlinguis-ticdescriptions,aswillbepresentedinthischapter.Thedistinctiveapproachpresentedhereistheabilityofmodellingandinterpretingnewobservationsthatarecouldbeunknown(i.e.,notpre-defined)forthesystem.

Withinthefieldoftimeseriesdatamining,theperception-basedanalysisofpatternsattemptstoformaliseknowledgeandsimulatehumanreasoning[35].Linguisticdescriptionscanrepresenttheperceptions(i.e.,wordssuchaslow,increasing,mostofthetime,etc.)oftimeseriespatterns.Atimeseriespat-ternisasubsequenceofaunivariatetimeseriescontainingameaningfulbe-haviourortrendofthedata.Manystudiesconsidertheproblemofqualitativeanalysisoftimeseriespatternsanditsmanipulationwithlinguisticinforma-tion[32,36,124,165,237].However,inmostofthem,therequiredlinguistic

139

140 CHAPTER 9. LINGUISTIC DESCRIPTIONS FOR PATTERNS

information is limited by the expert or domain knowledge. In other words, thedeveloped systems are designed to cover specific trends and shapes of patternswith a particular set of the requested linguistic characterisations as the generalvocabulary. For example, Yu et al. [237] developed a natural language genera-tion (NLG) framework to summarise the patterns in large time series in a fewsentences. This framework uses an ontology of patterns as general vocabularylike a spike, a step, etc. to comply with the linguistic requirements. The majordrawback of such a system is that any other observation (pattern) which is notmatched with the provided vocabulary cannot be described and reported in thefinal summary.

Here, the conceptual spaces theory is used to represent the linguistic char-acterisation of time series patterns in a semantic model. According to the pro-posed approach in Part I, this model provides a symbolic representation of bothknown and unknown patterns. This chapter shows how to construct such con-ceptual space for a given set of time series patterns, also shows how to inferencein the conceptual space for the linguistic characterisation of time series patterns.

9.1 Constructing a Conceptual Space of Time Series

Patterns

Assume that there is a set of time series patterns with varying lengths, whichare labelled with a set of linguistic terms (See Figure 9.1). Here, the same set ofphysiological patterns that are extracted from the MIMIC data set (explained inChapter 7) are used. The time series patterns are exploited from heart rate andrespiration rate recorded in several clinical conditions [32]. From this data set,78 patterns are used as known observations with varying time durations, whichare categorised (i.e., labelled by experts) into four known classes: increasing,decreasing, spike, and oscillation. These class labels of time series patterns areacquired from the common behavioural labels used in the literature of miningand linguistic characterisation of shapes and trends in time series data [35,107,165,236].

Formally, the data set and the labels of the patterns are defined as: Dp ={o1, . . . ,o78} and Yp = { yin : ‘Increasing’, yde : ‘Decreasing’, ysp : ‘Spike’,yos : ‘Oscillation’ }. Although there many other types of behaviours for timeseries patterns, these four classes are primarily chosen to simplify the processof conceptualising the patterns. Figure 9.1 shows the typical examples of suchpatterns for each of the mentioned classes of the data set. Regarding the set ofclass labels Yp, the set of concepts is defined as: Cp = {Cin, Cde, Csp, Cos}.

Initialising the primitive set of characteristic features is another input tobuild the conceptual space of time series patterns. There are many charac-teristic features for analysing and modelling time series data, from simplestatistical features to frequency related ones. As mentioned in the leaf dataset, the criterion is how describable or interpretable the features are in lin-

140CHAPTER9.LINGUISTICDESCRIPTIONSFORPATTERNS

informationislimitedbytheexpertordomainknowledge.Inotherwords,thedevelopedsystemsaredesignedtocoverspecifictrendsandshapesofpatternswithaparticularsetoftherequestedlinguisticcharacterisationsasthegeneralvocabulary.Forexample,Yuetal.[237]developedanaturallanguagegenera-tion(NLG)frameworktosummarisethepatternsinlargetimeseriesinafewsentences.Thisframeworkusesanontologyofpatternsasgeneralvocabularylikeaspike,astep,etc.tocomplywiththelinguisticrequirements.Themajordrawbackofsuchasystemisthatanyotherobservation(pattern)whichisnotmatchedwiththeprovidedvocabularycannotbedescribedandreportedinthefinalsummary.

Here,theconceptualspacestheoryisusedtorepresentthelinguisticchar-acterisationoftimeseriespatternsinasemanticmodel.Accordingtothepro-posedapproachinPartI,thismodelprovidesasymbolicrepresentationofbothknownandunknownpatterns.Thischaptershowshowtoconstructsuchcon-ceptualspaceforagivensetoftimeseriespatterns,alsoshowshowtoinferenceintheconceptualspaceforthelinguisticcharacterisationoftimeseriespatterns.

9.1ConstructingaConceptualSpaceofTimeSeries

Patterns

Assumethatthereisasetoftimeseriespatternswithvaryinglengths,whicharelabelledwithasetoflinguisticterms(SeeFigure9.1).Here,thesamesetofphysiologicalpatternsthatareextractedfromtheMIMICdataset(explainedinChapter7)areused.Thetimeseriespatternsareexploitedfromheartrateandrespirationraterecordedinseveralclinicalconditions[32].Fromthisdataset,78patternsareusedasknownobservationswithvaryingtimedurations,whicharecategorised(i.e.,labelledbyexperts)intofourknownclasses:increasing,decreasing,spike,andoscillation.Theseclasslabelsoftimeseriespatternsareacquiredfromthecommonbehaviourallabelsusedintheliteratureofminingandlinguisticcharacterisationofshapesandtrendsintimeseriesdata[35,107,165,236].

Formally,thedatasetandthelabelsofthepatternsaredefinedas:Dp=

{o1,...,o78}andYp={yin:‘Increasing’,yde:‘Decreasing’,ysp:‘Spike’,

yos:‘Oscillation’}.Althoughtheremanyothertypesofbehavioursfortimeseriespatterns,thesefourclassesareprimarilychosentosimplifytheprocessofconceptualisingthepatterns.Figure9.1showsthetypicalexamplesofsuchpatternsforeachofthementionedclassesofthedataset.RegardingthesetofclasslabelsY

p,thesetofconceptsisdefinedas:C

p={Cin,Cde,Csp,Cos}.

Initialisingtheprimitivesetofcharacteristicfeaturesisanotherinputtobuildtheconceptualspaceoftimeseriespatterns.Therearemanycharac-teristicfeaturesforanalysingandmodellingtimeseriesdata,fromsimplestatisticalfeaturestofrequencyrelatedones.Asmentionedintheleafdataset,thecriterionishowdescribableorinterpretablethefeaturesareinlin-


information is limited by the expert or domain knowledge. In other words, thedeveloped systems are designed to cover specific trends and shapes of patternswith a particular set of the requested linguistic characterisations as the generalvocabulary. For example, Yu et al. [237] developed a natural language genera-tion (NLG) framework to summarise the patterns in large time series in a fewsentences. This framework uses an ontology of patterns as general vocabularylike a spike, a step, etc. to comply with the linguistic requirements. The majordrawback of such a system is that any other observation (pattern) which is notmatched with the provided vocabulary cannot be described and reported in thefinal summary.

Here, the conceptual spaces theory is used to represent the linguistic char-acterisation of time series patterns in a semantic model. According to the pro-posed approach in Part I, this model provides a symbolic representation of bothknown and unknown patterns. This chapter shows how to construct such con-ceptual space for a given set of time series patterns, also shows how to inferencein the conceptual space for the linguistic characterisation of time series patterns.

9.1 Constructing a Conceptual Space of Time Series

Patterns

Assume that there is a set of time series patterns with varying lengths, whichare labelled with a set of linguistic terms (See Figure 9.1). Here, the same set ofphysiological patterns that are extracted from the MIMIC data set (explained inChapter 7) are used. The time series patterns are exploited from heart rate andrespiration rate recorded in several clinical conditions [32]. From this data set,78 patterns are used as known observations with varying time durations, whichare categorised (i.e., labelled by experts) into four known classes: increasing,decreasing, spike, and oscillation. These class labels of time series patterns areacquired from the common behavioural labels used in the literature of miningand linguistic characterisation of shapes and trends in time series data [35,107,165,236].

Formally, the data set and the labels of the patterns are defined as: Dp ={o1, . . . ,o78} and Yp = { yin : ‘Increasing’, yde : ‘Decreasing’, ysp : ‘Spike’,yos : ‘Oscillation’ }. Although there many other types of behaviours for timeseries patterns, these four classes are primarily chosen to simplify the processof conceptualising the patterns. Figure 9.1 shows the typical examples of suchpatterns for each of the mentioned classes of the data set. Regarding the set ofclass labels Yp, the set of concepts is defined as: Cp = {Cin, Cde, Csp, Cos}.

Initialising the primitive set of characteristic features is another input tobuild the conceptual space of time series patterns. There are many charac-teristic features for analysing and modelling time series data, from simplestatistical features to frequency related ones. As mentioned in the leaf dataset, the criterion is how describable or interpretable the features are in lin-


informationislimitedbytheexpertordomainknowledge.Inotherwords,thedevelopedsystemsaredesignedtocoverspecifictrendsandshapesofpatternswithaparticularsetoftherequestedlinguisticcharacterisationsasthegeneralvocabulary.Forexample,Yuetal.[237]developedanaturallanguagegenera-tion(NLG)frameworktosummarisethepatternsinlargetimeseriesinafewsentences.Thisframeworkusesanontologyofpatternsasgeneralvocabularylikeaspike,astep,etc.tocomplywiththelinguisticrequirements.Themajordrawbackofsuchasystemisthatanyotherobservation(pattern)whichisnotmatchedwiththeprovidedvocabularycannotbedescribedandreportedinthefinalsummary.

Here,theconceptualspacestheoryisusedtorepresentthelinguisticchar-acterisationoftimeseriespatternsinasemanticmodel.Accordingtothepro-posedapproachinPartI,thismodelprovidesasymbolicrepresentationofbothknownandunknownpatterns.Thischaptershowshowtoconstructsuchcon-ceptualspaceforagivensetoftimeseriespatterns,alsoshowshowtoinferenceintheconceptualspaceforthelinguisticcharacterisationoftimeseriespatterns.

9.1ConstructingaConceptualSpaceofTimeSeries

Patterns

Assumethatthereisasetoftimeseriespatternswithvaryinglengths,whicharelabelledwithasetoflinguisticterms(SeeFigure9.1).Here,thesamesetofphysiologicalpatternsthatareextractedfromtheMIMICdataset(explainedinChapter7)areused.Thetimeseriespatternsareexploitedfromheartrateandrespirationraterecordedinseveralclinicalconditions[32].Fromthisdataset,78patternsareusedasknownobservationswithvaryingtimedurations,whicharecategorised(i.e.,labelledbyexperts)intofourknownclasses:increasing,decreasing,spike,andoscillation.Theseclasslabelsoftimeseriespatternsareacquiredfromthecommonbehaviourallabelsusedintheliteratureofminingandlinguisticcharacterisationofshapesandtrendsintimeseriesdata[35,107,165,236].

Formally,thedatasetandthelabelsofthepatternsaredefinedas:Dp=

{o1,...,o78}andYp={yin:‘Increasing’,yde:‘Decreasing’,ysp:‘Spike’,

yos:‘Oscillation’}.Althoughtheremanyothertypesofbehavioursfortimeseriespatterns,thesefourclassesareprimarilychosentosimplifytheprocessofconceptualisingthepatterns.Figure9.1showsthetypicalexamplesofsuchpatternsforeachofthementionedclassesofthedataset.RegardingthesetofclasslabelsY

p,thesetofconceptsisdefinedas:C

p={Cin,Cde,Csp,Cos}.

Initialisingtheprimitivesetofcharacteristicfeaturesisanotherinputtobuildtheconceptualspaceoftimeseriespatterns.Therearemanycharac-teristicfeaturesforanalysingandmodellingtimeseriesdata,fromsimplestatisticalfeaturestofrequencyrelatedones.Asmentionedintheleafdataset,thecriterionishowdescribableorinterpretablethefeaturesareinlin-

9.1. CONSTRUCTING CONCEPTUAL SPACE OF PATTERNS 141

Figure 9.1: Four sets of time series patterns, presenting the known classes ofpatterns in the data set.

guistic form. For example, in time series pattern data set, the values of in-tegral or mean features can be useful for analytical tasks in time series min-ing, but these values are meaningless to the end user of the system to visu-alise or distinguish it from other patterns. In contrast, a feature like slope of apattern is perceptually interpretable for the user in natural language. Amongthe various features in the literature of feature-based time series data min-ing [83,107,160,203], the following features have been chosen as the initial setof features Fp = {Xi = 〈HXi

, IXi〉}:

Xα : 〈‘Slope’, (−π,π)〉 (the slope of pattern),XΔmm : 〈‘Min− Max Diff’, [0, inf)〉 (absolute difference between min and maxvalues),XΔse : 〈‘Start− End Diff’, [0, inf)〉 (difference between start and end values),XΔt : 〈‘Time interval’, (0, inf)〉 (time duration of pattern),Xen : 〈‘Entropy’, [0, inf)〉 (how chaotic is the pattern),Xfft : 〈‘Frequency’, [0, 1]〉1,X∂x : 〈‘First Derivative’, [0, 1]〉,X∂∂x : 〈‘Second Derivative’, [0, 1]〉,Xσ : 〈‘Standard Deviation’, [0, inf)〉.

9.1.1 Domain Specification for Time Series Pattern Data Set

After calculating all these features for every known observation, the conceptualspace construction approach has been applied with the inputs of known la-belled observations Dp, label set Yp, and feature set Fp. The approach first

1For some features, Ii needs to be set manually based on a mapping function from feature’svalues to one value in an interval. For example, the outputs of Fourier transform (fft) function of apattern is mapped to the values in the interval [0.1], likewise for first and second derivatives

9.1.CONSTRUCTINGCONCEPTUALSPACEOFPATTERNS141

Figure9.1:Foursetsoftimeseriespatterns,presentingtheknownclassesofpatternsinthedataset.

guisticform.Forexample,intimeseriespatterndataset,thevaluesofin-tegralormeanfeaturescanbeusefulforanalyticaltasksintimeseriesmin-ing,butthesevaluesaremeaninglesstotheenduserofthesystemtovisu-aliseordistinguishitfromotherpatterns.Incontrast,afeaturelikeslopeofapatternisperceptuallyinterpretablefortheuserinnaturallanguage.Amongthevariousfeaturesintheliteratureoffeature-basedtimeseriesdatamin-ing[83,107,160,203],thefollowingfeatureshavebeenchosenastheinitialsetoffeaturesF

p={Xi=〈HXi,IXi〉}:

Xα:〈‘Slope’,(−π,π)〉(theslopeofpattern),XΔmm:〈‘Min−MaxDiff’,[0,inf)〉(absolutedifferencebetweenminandmaxvalues),XΔse:〈‘Start−EndDiff’,[0,inf)〉(differencebetweenstartandendvalues),XΔt:〈‘Timeinterval’,(0,inf)〉(timedurationofpattern),Xen:〈‘Entropy’,[0,inf)〉(howchaoticisthepattern),Xfft:〈‘Frequency’,[0,1]〉1,X∂x:〈‘FirstDerivative’,[0,1]〉,X∂∂x:〈‘SecondDerivative’,[0,1]〉,Xσ:〈‘StandardDeviation’,[0,inf)〉.

9.1.1DomainSpecificationforTimeSeriesPatternDataSet

Aftercalculatingallthesefeaturesforeveryknownobservation,theconceptualspaceconstructionapproachhasbeenappliedwiththeinputsofknownla-belledobservationsD

p,labelsetY

p,andfeaturesetF

p.Theapproachfirst

1Forsomefeatures,Iineedstobesetmanuallybasedonamappingfunctionfromfeature’svaluestoonevalueinaninterval.Forexample,theoutputsofFouriertransform(fft)functionofapatternismappedtothevaluesintheinterval[0.1],likewiseforfirstandsecondderivatives


Figure 9.1: Four sets of time series patterns, presenting the known classes ofpatterns in the data set.

guistic form. For example, in time series pattern data set, the values of in-tegral or mean features can be useful for analytical tasks in time series min-ing, but these values are meaningless to the end user of the system to visu-alise or distinguish it from other patterns. In contrast, a feature like slope of apattern is perceptually interpretable for the user in natural language. Amongthe various features in the literature of feature-based time series data min-ing [83,107,160,203], the following features have been chosen as the initial setof features Fp = {Xi = 〈HXi

, IXi〉}:

Xα : 〈‘Slope’, (−π,π)〉 (the slope of pattern),XΔmm : 〈‘Min− Max Diff’, [0, inf)〉 (absolute difference between min and maxvalues),XΔse : 〈‘Start− End Diff’, [0, inf)〉 (difference between start and end values),XΔt : 〈‘Time interval’, (0, inf)〉 (time duration of pattern),Xen : 〈‘Entropy’, [0, inf)〉 (how chaotic is the pattern),Xfft : 〈‘Frequency’, [0, 1]〉1,X∂x : 〈‘First Derivative’, [0, 1]〉,X∂∂x : 〈‘Second Derivative’, [0, 1]〉,Xσ : 〈‘Standard Deviation’, [0, inf)〉.

9.1.1 Domain Specification for Time Series Pattern Data Set

After calculating all these features for every known observation, the conceptualspace construction approach has been applied with the inputs of known la-belled observations Dp, label set Yp, and feature set Fp. The approach first

1For some features, Ii needs to be set manually based on a mapping function from feature’svalues to one value in an interval. For example, the outputs of Fourier transform (fft) function of apattern is mapped to the values in the interval [0.1], likewise for first and second derivatives


Figure9.1:Foursetsoftimeseriespatterns,presentingtheknownclassesofpatternsinthedataset.

guisticform.Forexample,intimeseriespatterndataset,thevaluesofin-tegralormeanfeaturescanbeusefulforanalyticaltasksintimeseriesmin-ing,butthesevaluesaremeaninglesstotheenduserofthesystemtovisu-aliseordistinguishitfromotherpatterns.Incontrast,afeaturelikeslopeofapatternisperceptuallyinterpretablefortheuserinnaturallanguage.Amongthevariousfeaturesintheliteratureoffeature-basedtimeseriesdatamin-ing[83,107,160,203],thefollowingfeatureshavebeenchosenastheinitialsetoffeaturesF

p={Xi=〈HXi,IXi〉}:

Xα:〈‘Slope’,(−π,π)〉(theslopeofpattern),XΔmm:〈‘Min−MaxDiff’,[0,inf)〉(absolutedifferencebetweenminandmaxvalues),XΔse:〈‘Start−EndDiff’,[0,inf)〉(differencebetweenstartandendvalues),XΔt:〈‘Timeinterval’,(0,inf)〉(timedurationofpattern),Xen:〈‘Entropy’,[0,inf)〉(howchaoticisthepattern),Xfft:〈‘Frequency’,[0,1]〉1,X∂x:〈‘FirstDerivative’,[0,1]〉,X∂∂x:〈‘SecondDerivative’,[0,1]〉,Xσ:〈‘StandardDeviation’,[0,inf)〉.

9.1.1DomainSpecificationforTimeSeriesPatternDataSet

Aftercalculatingallthesefeaturesforeveryknownobservation,theconceptualspaceconstructionapproachhasbeenappliedwiththeinputsofknownla-belledobservationsD

p,labelsetY

p,andfeaturesetF

p.Theapproachfirst

1Forsomefeatures,Iineedstobesetmanuallybasedonamappingfunctionfromfeature’svaluestoonevalueinaninterval.Forexample,theoutputsofFouriertransform(fft)functionofapatternismappedtothevaluesintheinterval[0.1],likewiseforfirstandsecondderivatives


yin yde ysp yos

Xα XΔmm XΔse XΔt Xen Xfft X∂x X∂∂x Xσ

Figure 9.2: The bipartite graph presenting the relevance of the features and thelabels in data set of time series patterns. Also, two chosen bicliques (as thedomains) are highlighted with the blue and red edges.

utilises the feature filtering approach, i.e. MIFS (Algorithm 3.1) to providea ranking matrix which shows the mutual correlation of features and labels.Then, the feature subset grouping determines which subsets of features as do-mains represent which labels as concepts using the Algorithm 3.2. Figure 9.2illustrates the created bipartite graph, which presents the specified domains andquality dimensions. Two selected maximum bicliques determines two domainsΔ = {δ1, δ2}, where each domain is specified as follows:

• Domain δ1 = 〈Q(δ1),C(δ1),ωδ1〉, whereinQ(δ1) = {qα, qΔse},C(δ1) = {Cin,Cde}.

• Domain δ2 = 〈Q(δ2),C(δ2),ωδ2〉, whereinQ(δ2) = {qΔt,qΔmm,qfft},C(δ2) = {Csp,Cos}.

Figure 9.3 depicts a graphical presentation of the determined domainswith the corresponding quality dimensions and concepts for the knowntime series patterns. As an example, δ1 is specified by two quality dimen-sions ‘start− end diff’ and ‘slope’, and is associated with two concepts‘Increasing’ and ‘Decreasing’. An example of the calculated weights in a do-main is ωδ1(Cin,qα) = 0.62, which shows the salience of the relation betweenpattern concept ‘Increasing’ and quality dimension ‘slope’ within δ1. Similarto the leaf conceptual space, although the process of specifying the domains isdata-driven, there may be an interpretation for each determined domain. Here,the interpretation of perceived domains is more sensible. For instance, one cansay that δ1 illustrates the trend direction of the known patterns, while δ2 showsthe shape of the known patterns (see Figure 5.3). As the output of the do-main specification phase for the conceptual space of patterns, the set of qualitydimensions will be Qp = {qα, qΔse, qΔt, qΔmm, qfft}. Moreover, the set ofinstances is defined as: Γp = ∪y∈Y Γ(y), where |Γp| = |Dp|.


yinydeyspyos

XαXΔmmXΔseXΔtXenXfftX∂xX∂∂xXσ

Figure9.2:Thebipartitegraphpresentingtherelevanceofthefeaturesandthelabelsindatasetoftimeseriespatterns.Also,twochosenbicliques(asthedomains)arehighlightedwiththeblueandrededges.

utilisesthefeaturefilteringapproach,i.e.MIFS(Algorithm3.1)toprovidearankingmatrixwhichshowsthemutualcorrelationoffeaturesandlabels.Then,thefeaturesubsetgroupingdetermineswhichsubsetsoffeaturesasdo-mainsrepresentwhichlabelsasconceptsusingtheAlgorithm3.2.Figure9.2illustratesthecreatedbipartitegraph,whichpresentsthespecifieddomainsandqualitydimensions.TwoselectedmaximumbicliquesdeterminestwodomainsΔ={δ1,δ2},whereeachdomainisspecifiedasfollows:

•Domainδ1=〈Q(δ1),C(δ1),ωδ1〉,whereinQ(δ1)={qα,qΔse},C(δ1)={Cin,Cde}.

•Domainδ2=〈Q(δ2),C(δ2),ωδ2〉,whereinQ(δ2)={qΔt,qΔmm,qfft},C(δ2)={Csp,Cos}.

Figure9.3depictsagraphicalpresentationofthedetermineddomainswiththecorrespondingqualitydimensionsandconceptsfortheknowntimeseriespatterns.Asanexample,δ1isspecifiedbytwoqualitydimen-sions‘start−enddiff’and‘slope’,andisassociatedwithtwoconcepts‘Increasing’and‘Decreasing’.Anexampleofthecalculatedweightsinado-mainisωδ1(Cin,qα)=0.62,whichshowsthesalienceoftherelationbetweenpatternconcept‘Increasing’andqualitydimension‘slope’withinδ1.Similartotheleafconceptualspace,althoughtheprocessofspecifyingthedomainsisdata-driven,theremaybeaninterpretationforeachdetermineddomain.Here,theinterpretationofperceiveddomainsismoresensible.Forinstance,onecansaythatδ1illustratesthetrenddirectionoftheknownpatterns,whileδ2showstheshapeoftheknownpatterns(seeFigure5.3).Astheoutputofthedo-mainspecificationphasefortheconceptualspaceofpatterns,thesetofqualitydimensionswillbeQ

p={qα,qΔse,qΔt,qΔmm,qfft}.Moreover,thesetof

instancesisdefinedas:Γp=∪y∈YΓ(y),where|Γ

p|=|D

p|.


yin yde ysp yos

Xα XΔmm XΔse XΔt Xen Xfft X∂x X∂∂x Xσ

Figure 9.2: The bipartite graph presenting the relevance of the features and thelabels in data set of time series patterns. Also, two chosen bicliques (as thedomains) are highlighted with the blue and red edges.

utilises the feature filtering approach, i.e. MIFS (Algorithm 3.1) to providea ranking matrix which shows the mutual correlation of features and labels.Then, the feature subset grouping determines which subsets of features as do-mains represent which labels as concepts using the Algorithm 3.2. Figure 9.2illustrates the created bipartite graph, which presents the specified domains andquality dimensions. Two selected maximum bicliques determines two domainsΔ = {δ1, δ2}, where each domain is specified as follows:

• Domain δ1 = 〈Q(δ1),C(δ1),ωδ1〉, whereinQ(δ1) = {qα, qΔse},C(δ1) = {Cin,Cde}.

• Domain δ2 = 〈Q(δ2),C(δ2),ωδ2〉, whereinQ(δ2) = {qΔt,qΔmm,qfft},C(δ2) = {Csp,Cos}.

Figure 9.3 depicts a graphical presentation of the determined domainswith the corresponding quality dimensions and concepts for the knowntime series patterns. As an example, δ1 is specified by two quality dimen-sions ‘start− end diff’ and ‘slope’, and is associated with two concepts‘Increasing’ and ‘Decreasing’. An example of the calculated weights in a do-main is ωδ1(Cin,qα) = 0.62, which shows the salience of the relation betweenpattern concept ‘Increasing’ and quality dimension ‘slope’ within δ1. Similarto the leaf conceptual space, although the process of specifying the domains isdata-driven, there may be an interpretation for each determined domain. Here,the interpretation of perceived domains is more sensible. For instance, one cansay that δ1 illustrates the trend direction of the known patterns, while δ2 showsthe shape of the known patterns (see Figure 5.3). As the output of the do-main specification phase for the conceptual space of patterns, the set of qualitydimensions will be Qp = {qα, qΔse, qΔt, qΔmm, qfft}. Moreover, the set ofinstances is defined as: Γp = ∪y∈Y Γ(y), where |Γp| = |Dp|.


yinydeyspyos

XαXΔmmXΔseXΔtXenXfftX∂xX∂∂xXσ

Figure9.2:Thebipartitegraphpresentingtherelevanceofthefeaturesandthelabelsindatasetoftimeseriespatterns.Also,twochosenbicliques(asthedomains)arehighlightedwiththeblueandrededges.

utilisesthefeaturefilteringapproach,i.e.MIFS(Algorithm3.1)toprovidearankingmatrixwhichshowsthemutualcorrelationoffeaturesandlabels.Then,thefeaturesubsetgroupingdetermineswhichsubsetsoffeaturesasdo-mainsrepresentwhichlabelsasconceptsusingtheAlgorithm3.2.Figure9.2illustratesthecreatedbipartitegraph,whichpresentsthespecifieddomainsandqualitydimensions.TwoselectedmaximumbicliquesdeterminestwodomainsΔ={δ1,δ2},whereeachdomainisspecifiedasfollows:

•Domainδ1=〈Q(δ1),C(δ1),ωδ1〉,whereinQ(δ1)={qα,qΔse},C(δ1)={Cin,Cde}.

•Domainδ2=〈Q(δ2),C(δ2),ωδ2〉,whereinQ(δ2)={qΔt,qΔmm,qfft},C(δ2)={Csp,Cos}.

Figure9.3depictsagraphicalpresentationofthedetermineddomainswiththecorrespondingqualitydimensionsandconceptsfortheknowntimeseriespatterns.Asanexample,δ1isspecifiedbytwoqualitydimen-sions‘start−enddiff’and‘slope’,andisassociatedwithtwoconcepts‘Increasing’and‘Decreasing’.Anexampleofthecalculatedweightsinado-mainisωδ1(Cin,qα)=0.62,whichshowsthesalienceoftherelationbetweenpatternconcept‘Increasing’andqualitydimension‘slope’withinδ1.Similartotheleafconceptualspace,althoughtheprocessofspecifyingthedomainsisdata-driven,theremaybeaninterpretationforeachdetermineddomain.Here,theinterpretationofperceiveddomainsismoresensible.Forinstance,onecansaythatδ1illustratesthetrenddirectionoftheknownpatterns,whileδ2showstheshapeoftheknownpatterns(seeFigure5.3).Astheoutputofthedo-mainspecificationphasefortheconceptualspaceofpatterns,thesetofqualitydimensionswillbeQ

p={qα,qΔse,qΔt,qΔmm,qfft}.Moreover,thesetof

instancesisdefinedas:Γp=∪y∈YΓ(y),where|Γ

p|=|D

p|.


Figure 9.3: The conceptual space of time series pattern data set: a graphicalpresentation of the determined domains with the corresponding quality dimen-sions and concepts.

9.1.2 Concept Representation for Pattern Concepts

Regarding the output of the domain specification process, each concept in Cp

appears in only one domain (has precisely one sub-concept), as Cy = {cy}. Byapplying Algorithm 3.3, the elements of the sub-concept for each concept in Cp

is derived as follows.

• Pattern concepts ‘Increasing’ and ‘Decreasing’ are represented in δ1 as,respectively:Cin = {c1

in : 〈η1in,φ

1in〉},

Cde = {c1de : 〈η1

de,φ1de〉}.

• Pattern concepts ‘Spike’ and ‘Oscillation’ are represented in δ1 as, re-spectively:Csp = {c2

sp : 〈η2sp,φ

2sp〉},

Cos = {c2os : 〈η2

os,φ2os〉}.

In these representations, for example, η2inc shows the 3D convex polytope

of pattern concept ′Spike ′ within δ2 (see Figure 9.3). Also, as an examplefor the weights, φ2

sp = {ωδ2(Csp,qΔt), ωδ2(Csp,qΔmm), ωδ2(Csp,qfft)} showsthe salience between pattern concept ‘Spike’ and three quality dimensions‘time interval’, ‘min− max diff’, and ‘frequency’ within δ2. In Figure 9.3,the graphical presentation of time series pattern concepts is shown by illustrat-ing the convex hulls of their corresponding sub-concepts.


Figure9.3:Theconceptualspaceoftimeseriespatterndataset:agraphicalpresentationofthedetermineddomainswiththecorrespondingqualitydimen-sionsandconcepts.

9.1.2ConceptRepresentationforPatternConcepts

Regardingtheoutputofthedomainspecificationprocess,eachconceptinCp

appearsinonlyonedomain(haspreciselyonesub-concept),asCy={cy}.ByapplyingAlgorithm3.3,theelementsofthesub-conceptforeachconceptinC

p

isderivedasfollows.

•Patternconcepts‘Increasing’and‘Decreasing’arerepresentedinδ1as,respectively:Cin={c1

in:〈η1in,φ1

in〉},Cde={c1

de:〈η1de,φ1

de〉}.•Patternconcepts‘Spike’and‘Oscillation’arerepresentedinδ1as,re-

spectively:Csp={c2

sp:〈η2sp,φ2

sp〉},Cos={c2

os:〈η2os,φ2

os〉}.

Intheserepresentations,forexample,η2incshowsthe3Dconvexpolytope

ofpatternconcept′Spike′withinδ2(seeFigure9.3).Also,asanexamplefortheweights,φ2

sp={ωδ2(Csp,qΔt),ωδ2(Csp,qΔmm),ωδ2(Csp,qfft)}showsthesaliencebetweenpatternconcept‘Spike’andthreequalitydimensions‘timeinterval’,‘min−maxdiff’,and‘frequency’withinδ2.InFigure9.3,thegraphicalpresentationoftimeseriespatternconceptsisshownbyillustrat-ingtheconvexhullsoftheircorrespondingsub-concepts.


Figure 9.3: The conceptual space of time series pattern data set: a graphicalpresentation of the determined domains with the corresponding quality dimen-sions and concepts.

9.1.2 Concept Representation for Pattern Concepts

Regarding the output of the domain specification process, each concept in Cp

appears in only one domain (has precisely one sub-concept), as Cy = {cy}. Byapplying Algorithm 3.3, the elements of the sub-concept for each concept in Cp

is derived as follows.

• Pattern concepts ‘Increasing’ and ‘Decreasing’ are represented in δ1 as,respectively:Cin = {c1

in : 〈η1in,φ

1in〉},

Cde = {c1de : 〈η1

de,φ1de〉}.

• Pattern concepts ‘Spike’ and ‘Oscillation’ are represented in δ1 as, re-spectively:Csp = {c2

sp : 〈η2sp,φ

2sp〉},

Cos = {c2os : 〈η2

os,φ2os〉}.

In these representations, for example, η2inc shows the 3D convex polytope

of pattern concept ′Spike ′ within δ2 (see Figure 9.3). Also, as an examplefor the weights, φ2

sp = {ωδ2(Csp,qΔt), ωδ2(Csp,qΔmm), ωδ2(Csp,qfft)} showsthe salience between pattern concept ‘Spike’ and three quality dimensions‘time interval’, ‘min− max diff’, and ‘frequency’ within δ2. In Figure 9.3,the graphical presentation of time series pattern concepts is shown by illustrat-ing the convex hulls of their corresponding sub-concepts.


Figure9.3:Theconceptualspaceoftimeseriespatterndataset:agraphicalpresentationofthedetermineddomainswiththecorrespondingqualitydimen-sionsandconcepts.

9.1.2ConceptRepresentationforPatternConcepts

Regardingtheoutputofthedomainspecificationprocess,eachconceptinCp

appearsinonlyonedomain(haspreciselyonesub-concept),asCy={cy}.ByapplyingAlgorithm3.3,theelementsofthesub-conceptforeachconceptinC

p

isderivedasfollows.

•Patternconcepts‘Increasing’and‘Decreasing’arerepresentedinδ1as,respectively:Cin={c1

in:〈η1in,φ1

in〉},Cde={c1

de:〈η1de,φ1

de〉}.•Patternconcepts‘Spike’and‘Oscillation’arerepresentedinδ1as,re-

spectively:Csp={c2

sp:〈η2sp,φ2

sp〉},Cos={c2

os:〈η2os,φ2

os〉}.

Intheserepresentations,forexample,η2incshowsthe3Dconvexpolytope

ofpatternconcept′Spike′withinδ2(seeFigure9.3).Also,asanexamplefortheweights,φ2

sp={ωδ2(Csp,qΔt),ωδ2(Csp,qΔmm),ωδ2(Csp,qfft)}showsthesaliencebetweenpatternconcept‘Spike’andthreequalitydimensions‘timeinterval’,‘min−maxdiff’,and‘frequency’withinδ2.InFigure9.3,thegraphicalpresentationoftimeseriespatternconceptsisshownbyillustrat-ingtheconvexhullsoftheircorrespondingsub-concepts.


Figure 9.4: A set of unknown samples of time series patterns.

Now, with the provided elements, the conceptual space of the time seriespattern data set is presented as: Spatterns = 〈 Qp, Δp, Cp, Γp 〉.

9.2 Semantic Inference for Unknown Patterns

The aim is to derive a linguistic description for unknown time series patterns.Figure 9.4 shows a number of examples of unknown patterns that are chosen tobe considered. According to the proposed inference process, an unknown pat-tern sample (e.g., pattern (a) in Figure 9.4) is first vectorised to an instance γa.Then, a linguistic description for (a) is inferred in two phases: set the values ofsymbol vector and set the lexicons, by inferring in conceptual space and sym-bol space, respectively. γa is a set of points within Δp(γa) as: γa = {p1

γa,p2

γa},

where the numeric values of each point are the feature value of (a) for eachquality dimension. For example, in δ1: p2

γa=〈qα(a), qΔse(a)〉=〈0.34, 0.28〉.

9.2.1 Inference in Conceptual Space of Patterns

In task of this phase is basically to check whether or not the new instance γa isincluded in any defined concept’s regions, and then infer semantic descriptionsbased on the closeness of its values to the regions. Considering the pattern sam-ple (a) in Figure 9.1, γa belongs to the sub-concept c1

de in δ1, but it does notbelong to any sub-concept in δ2. Based on Algorithm 4.1, the symbol vectorfor γa is set as follows: In the concept layer, using the graded membershipfunction (defined in Definition 4.2): Vγa,C(Cde) = (‘Decreasing’, 1). In thequality layer, for the quality dimensions of δ2, using the graded quality func-tion (defined in Definition 4.3): Vγa,Q(q

2Δt) = (‘long’, 0.75), Vγa,Q(q

2Δmm) =

(‘medium range’, 0.62), and Vγa,Q(q2fft) = (‘fluctuate’, 0.7).


Figure9.4:Asetofunknownsamplesoftimeseriespatterns.

Now,withtheprovidedelements,theconceptualspaceofthetimeseriespatterndatasetispresentedas:Spatterns=〈Q

p,Δ

p,C

p,Γ

p〉.

9.2SemanticInferenceforUnknownPatterns

Theaimistoderivealinguisticdescriptionforunknowntimeseriespatterns.Figure9.4showsanumberofexamplesofunknownpatternsthatarechosentobeconsidered.Accordingtotheproposedinferenceprocess,anunknownpat-ternsample(e.g.,pattern(a)inFigure9.4)isfirstvectorisedtoaninstanceγa.Then,alinguisticdescriptionfor(a)isinferredintwophases:setthevaluesofsymbolvectorandsetthelexicons,byinferringinconceptualspaceandsym-bolspace,respectively.γaisasetofpointswithinΔ

p(γa)as:γa={p1

γa,p2γa},

wherethenumericvaluesofeachpointarethefeaturevalueof(a)foreachqualitydimension.Forexample,inδ1:p2

γa=〈qα(a),qΔse(a)〉=〈0.34,0.28〉.

9.2.1InferenceinConceptualSpaceofPatterns

Intaskofthisphaseisbasicallytocheckwhetherornotthenewinstanceγaisincludedinanydefinedconcept’sregions,andtheninfersemanticdescriptionsbasedontheclosenessofitsvaluestotheregions.Consideringthepatternsam-ple(a)inFigure9.1,γabelongstothesub-conceptc1

deinδ1,butitdoesnotbelongtoanysub-conceptinδ2.BasedonAlgorithm4.1,thesymbolvectorforγaissetasfollows:Intheconceptlayer,usingthegradedmembershipfunction(definedinDefinition4.2):Vγa,C(Cde)=(‘Decreasing’,1).Inthequalitylayer,forthequalitydimensionsofδ2,usingthegradedqualityfunc-tion(definedinDefinition4.3):Vγa,Q(q2

Δt)=(‘long’,0.75),Vγa,Q(q2Δmm)=

(‘mediumrange’,0.62),andVγa,Q(q2fft)=(‘fluctuate’,0.7).


Figure 9.4: A set of unknown samples of time series patterns.

Now, with the provided elements, the conceptual space of the time seriespattern data set is presented as: Spatterns = 〈 Qp, Δp, Cp, Γp 〉.

9.2 Semantic Inference for Unknown Patterns

The aim is to derive a linguistic description for unknown time series patterns.Figure 9.4 shows a number of examples of unknown patterns that are chosen tobe considered. According to the proposed inference process, an unknown pat-tern sample (e.g., pattern (a) in Figure 9.4) is first vectorised to an instance γa.Then, a linguistic description for (a) is inferred in two phases: set the values ofsymbol vector and set the lexicons, by inferring in conceptual space and sym-bol space, respectively. γa is a set of points within Δp(γa) as: γa = {p1

γa,p2

γa},

where the numeric values of each point are the feature value of (a) for eachquality dimension. For example, in δ1: p2

γa=〈qα(a), qΔse(a)〉=〈0.34, 0.28〉.

9.2.1 Inference in Conceptual Space of Patterns

In task of this phase is basically to check whether or not the new instance γa isincluded in any defined concept’s regions, and then infer semantic descriptionsbased on the closeness of its values to the regions. Considering the pattern sam-ple (a) in Figure 9.1, γa belongs to the sub-concept c1

de in δ1, but it does notbelong to any sub-concept in δ2. Based on Algorithm 4.1, the symbol vectorfor γa is set as follows: In the concept layer, using the graded membershipfunction (defined in Definition 4.2): Vγa,C(Cde) = (‘Decreasing’, 1). In thequality layer, for the quality dimensions of δ2, using the graded quality func-tion (defined in Definition 4.3): Vγa,Q(q

2Δt) = (‘long’, 0.75), Vγa,Q(q

2Δmm) =

(‘medium range’, 0.62), and Vγa,Q(q2fft) = (‘fluctuate’, 0.7).


Figure9.4:Asetofunknownsamplesoftimeseriespatterns.

Now,withtheprovidedelements,theconceptualspaceofthetimeseriespatterndatasetispresentedas:Spatterns=〈Q

p,Δ

p,C

p,Γ

p〉.

9.2SemanticInferenceforUnknownPatterns

Theaimistoderivealinguisticdescriptionforunknowntimeseriespatterns.Figure9.4showsanumberofexamplesofunknownpatternsthatarechosentobeconsidered.Accordingtotheproposedinferenceprocess,anunknownpat-ternsample(e.g.,pattern(a)inFigure9.4)isfirstvectorisedtoaninstanceγa.Then,alinguisticdescriptionfor(a)isinferredintwophases:setthevaluesofsymbolvectorandsetthelexicons,byinferringinconceptualspaceandsym-bolspace,respectively.γaisasetofpointswithinΔ

p(γa)as:γa={p1

γa,p2γa},

wherethenumericvaluesofeachpointarethefeaturevalueof(a)foreachqualitydimension.Forexample,inδ1:p2

γa=〈qα(a),qΔse(a)〉=〈0.34,0.28〉.

9.2.1InferenceinConceptualSpaceofPatterns

Intaskofthisphaseisbasicallytocheckwhetherornotthenewinstanceγaisincludedinanydefinedconcept’sregions,andtheninfersemanticdescriptionsbasedontheclosenessofitsvaluestotheregions.Consideringthepatternsam-ple(a)inFigure9.1,γabelongstothesub-conceptc1

deinδ1,butitdoesnotbelongtoanysub-conceptinδ2.BasedonAlgorithm4.1,thesymbolvectorforγaissetasfollows:Intheconceptlayer,usingthegradedmembershipfunction(definedinDefinition4.2):Vγa,C(Cde)=(‘Decreasing’,1).Inthequalitylayer,forthequalitydimensionsofδ2,usingthegradedqualityfunc-tion(definedinDefinition4.3):Vγa,Q(q2

Δt)=(‘long’,0.75),Vγa,Q(q2Δmm)=

(‘mediumrange’,0.62),andVγa,Q(q2fft)=(‘fluctuate’,0.7).

9.2. SEMANTIC INFERENCE FOR UNKNOWN PATTERNS 145

9.2.2 Inference in Symbol Space of Patterns

By retrieving the information of symbol vector V(γa), it is possible to ver-balise the elements of symbol vectors into a set of natural language descrip-tions. As mentioned in Section 4.2.2, γa is annotated by the values of Vγa,C,and characterised by the values of Vγa,C. In particular, the annotation set isTC(γa) =‘Decreasing’, and the characterisation set will be TQ(γa) = { ‘long’,‘medium range’, ‘fluctuate’ }. Then the realisation for γa is as follows: Tγa

=‘Decreasing, also fluctuates and it is long with medium range’. Table 9.1present more results derived from the semantic inference in the conceptualspace for the time series patterns shown in Figure 9.4.

9.2.SEMANTICINFERENCEFORUNKNOWNPATTERNS145

9.2.2InferenceinSymbolSpaceofPatterns

ByretrievingtheinformationofsymbolvectorV(γa),itispossibletover-balisetheelementsofsymbolvectorsintoasetofnaturallanguagedescrip-tions.AsmentionedinSection4.2.2,γaisannotatedbythevaluesofVγa,C,andcharacterisedbythevaluesofVγa,C.Inparticular,theannotationsetisTC(γa)=‘Decreasing’,andthecharacterisationsetwillbeTQ(γa)={‘long’,‘mediumrange’,‘fluctuate’}.Thentherealisationforγaisasfollows:Tγa=‘Decreasing,alsofluctuatesanditislongwithmediumrange’.Table9.1presentmoreresultsderivedfromthesemanticinferenceintheconceptualspaceforthetimeseriespatternsshowninFigure9.4.

9.2. SEMANTIC INFERENCE FOR UNKNOWN PATTERNS 145

9.2.2 Inference in Symbol Space of Patterns

By retrieving the information of symbol vector V(γa), it is possible to ver-balise the elements of symbol vectors into a set of natural language descrip-tions. As mentioned in Section 4.2.2, γa is annotated by the values of Vγa,C,and characterised by the values of Vγa,C. In particular, the annotation set isTC(γa) =‘Decreasing’, and the characterisation set will be TQ(γa) = { ‘long’,‘medium range’, ‘fluctuate’ }. Then the realisation for γa is as follows: Tγa

=‘Decreasing, also fluctuates and it is long with medium range’. Table 9.1present more results derived from the semantic inference in the conceptualspace for the time series patterns shown in Figure 9.4.

9.2.SEMANTICINFERENCEFORUNKNOWNPATTERNS145

9.2.2InferenceinSymbolSpaceofPatterns

ByretrievingtheinformationofsymbolvectorV(γa),itispossibletover-balisetheelementsofsymbolvectorsintoasetofnaturallanguagedescrip-tions.AsmentionedinSection4.2.2,γaisannotatedbythevaluesofVγa,C,andcharacterisedbythevaluesofVγa,C.Inparticular,theannotationsetisTC(γa)=‘Decreasing’,andthecharacterisationsetwillbeTQ(γa)={‘long’,‘mediumrange’,‘fluctuate’}.Thentherealisationforγaisasfollows:Tγa=‘Decreasing,alsofluctuatesanditislongwithmediumrange’.Table9.1presentmoreresultsderivedfromthesemanticinferenceintheconceptualspaceforthetimeseriespatternsshowninFigure9.4.


Patterns Linguistic DescriptionFig. 9.4(a) This pattern is an Increasing pattern, but it is smooth and very

short, within a medium range of values.

Fig. 9.4(b) This pattern is like a Spike pattern, but it is noisy with a sharpdecreasing trend in a high range of values.

Fig. 9.4(c) This pattern is an Oscillation pattern, within a very short range ofvalues.

Fig. 9.4(d) This pattern is an Increasing pattern, but it fluctuates in a very longduration, within a large range of values.

Fig. 9.4(e) This pattern is an Increasing pattern, but it fluctuates within amedium range of values.

Fig. 9.4(f) This pattern is not like any known pattern, but it fluctuates withina high range of values.

Fig. 9.4(g) This pattern is an Increasing pattern, but it is long, within amedium range of values.

Fig. 9.4(h) This pattern is like Decreasing Spike pattern, in a short duration.

Fig. 9.4(i) This pattern is like a Spike pattern, but it has a sharp rise andfluctuation, within a medium range of values.

Fig. 9.4(j) This pattern is a Decreasing pattern, but it fluctuates in a longduration, within a medium range of values.

Fig. 9.4(k) This pattern is a Spike pattern, with the same start and end values.

Fig. 9.4(l) This pattern is like a Spike pattern, but it has a smooth and slowincreasing trend, within a medium range of values.

Fig. 9.4(m) This pattern is not like any known pattern, but it wavy within amedium range, and very long duration. Also, it has the same startand end values.

Fig. 9.4(n) This pattern is not like any known pattern, but it has a normaldecreasing trend in a very long duration. Also, it is very fluctuatingin a large range of values.

Fig. 9.4(o) This pattern is like Decreasing Oscillation pattern.

Fig. 9.4(p) This pattern is like a Spike pattern, but it is very short and smooth,within a high range of values

Fig. 9.4(q) This pattern is like Spike and Decreasing patterns.

Table 9.1: The linguistic descriptions derived for the unknown samples of timeseries patterns in Figure 9.4.


PatternsLinguisticDescriptionFig.9.4(a)ThispatternisanIncreasingpattern,butitissmoothandvery

short,withinamediumrangeofvalues.

Fig.9.4(b)ThispatternislikeaSpikepattern,butitisnoisywithasharpdecreasingtrendinahighrangeofvalues.

Fig.9.4(c)ThispatternisanOscillationpattern,withinaveryshortrangeofvalues.

Fig.9.4(d)ThispatternisanIncreasingpattern,butitfluctuatesinaverylongduration,withinalargerangeofvalues.

Fig.9.4(e)ThispatternisanIncreasingpattern,butitfluctuateswithinamediumrangeofvalues.

Fig.9.4(f)Thispatternisnotlikeanyknownpattern,butitfluctuateswithinahighrangeofvalues.

Fig.9.4(g)ThispatternisanIncreasingpattern,butitislong,withinamediumrangeofvalues.

Fig.9.4(h)ThispatternislikeDecreasingSpikepattern,inashortduration.

Fig.9.4(i)ThispatternislikeaSpikepattern,butithasasharpriseandfluctuation,withinamediumrangeofvalues.

Fig.9.4(j)ThispatternisaDecreasingpattern,butitfluctuatesinalongduration,withinamediumrangeofvalues.

Fig.9.4(k)ThispatternisaSpikepattern,withthesamestartandendvalues.

Fig.9.4(l)ThispatternislikeaSpikepattern,butithasasmoothandslowincreasingtrend,withinamediumrangeofvalues.

Fig.9.4(m)Thispatternisnotlikeanyknownpattern,butitwavywithinamediumrange,andverylongduration.Also,ithasthesamestartandendvalues.

Fig.9.4(n)Thispatternisnotlikeanyknownpattern,butithasanormaldecreasingtrendinaverylongduration.Also,itisveryfluctuatinginalargerangeofvalues.

Fig.9.4(o)ThispatternislikeDecreasingOscillationpattern.

Fig.9.4(p)ThispatternislikeaSpikepattern,butitisveryshortandsmooth,withinahighrangeofvalues

Fig.9.4(q)ThispatternislikeSpikeandDecreasingpatterns.

Table9.1:ThelinguisticdescriptionsderivedfortheunknownsamplesoftimeseriespatternsinFigure9.4.


Patterns Linguistic DescriptionFig. 9.4(a) This pattern is an Increasing pattern, but it is smooth and very

short, within a medium range of values.

Fig. 9.4(b) This pattern is like a Spike pattern, but it is noisy with a sharpdecreasing trend in a high range of values.

Fig. 9.4(c) This pattern is an Oscillation pattern, within a very short range ofvalues.

Fig. 9.4(d) This pattern is an Increasing pattern, but it fluctuates in a very longduration, within a large range of values.

Fig. 9.4(e) This pattern is an Increasing pattern, but it fluctuates within amedium range of values.

Fig. 9.4(f) This pattern is not like any known pattern, but it fluctuates withina high range of values.

Fig. 9.4(g) This pattern is an Increasing pattern, but it is long, within amedium range of values.

Fig. 9.4(h) This pattern is like Decreasing Spike pattern, in a short duration.

Fig. 9.4(i) This pattern is like a Spike pattern, but it has a sharp rise andfluctuation, within a medium range of values.

Fig. 9.4(j) This pattern is a Decreasing pattern, but it fluctuates in a longduration, within a medium range of values.

Fig. 9.4(k) This pattern is a Spike pattern, with the same start and end values.

Fig. 9.4(l) This pattern is like a Spike pattern, but it has a smooth and slowincreasing trend, within a medium range of values.

Fig. 9.4(m) This pattern is not like any known pattern, but it wavy within amedium range, and very long duration. Also, it has the same startand end values.

Fig. 9.4(n) This pattern is not like any known pattern, but it has a normaldecreasing trend in a very long duration. Also, it is very fluctuatingin a large range of values.

Fig. 9.4(o) This pattern is like Decreasing Oscillation pattern.

Fig. 9.4(p) This pattern is like a Spike pattern, but it is very short and smooth,within a high range of values

Fig. 9.4(q) This pattern is like Spike and Decreasing patterns.

Table 9.1: The linguistic descriptions derived for the unknown samples of timeseries patterns in Figure 9.4.


PatternsLinguisticDescriptionFig.9.4(a)ThispatternisanIncreasingpattern,butitissmoothandvery

short,withinamediumrangeofvalues.

Fig.9.4(b)ThispatternislikeaSpikepattern,butitisnoisywithasharpdecreasingtrendinahighrangeofvalues.

Fig.9.4(c)ThispatternisanOscillationpattern,withinaveryshortrangeofvalues.

Fig.9.4(d)ThispatternisanIncreasingpattern,butitfluctuatesinaverylongduration,withinalargerangeofvalues.

Fig.9.4(e)ThispatternisanIncreasingpattern,butitfluctuateswithinamediumrangeofvalues.

Fig.9.4(f)Thispatternisnotlikeanyknownpattern,butitfluctuateswithinahighrangeofvalues.

Fig.9.4(g)ThispatternisanIncreasingpattern,butitislong,withinamediumrangeofvalues.

Fig.9.4(h)ThispatternislikeDecreasingSpikepattern,inashortduration.

Fig.9.4(i)ThispatternislikeaSpikepattern,butithasasharpriseandfluctuation,withinamediumrangeofvalues.

Fig.9.4(j)ThispatternisaDecreasingpattern,butitfluctuatesinalongduration,withinamediumrangeofvalues.

Fig.9.4(k)ThispatternisaSpikepattern,withthesamestartandendvalues.

Fig.9.4(l)ThispatternislikeaSpikepattern,butithasasmoothandslowincreasingtrend,withinamediumrangeofvalues.

Fig.9.4(m)Thispatternisnotlikeanyknownpattern,butitwavywithinamediumrange,andverylongduration.Also,ithasthesamestartandendvalues.

Fig.9.4(n)Thispatternisnotlikeanyknownpattern,butithasanormaldecreasingtrendinaverylongduration.Also,itisveryfluctuatinginalargerangeofvalues.

Fig.9.4(o)ThispatternislikeDecreasingOscillationpattern.

Fig.9.4(p)ThispatternislikeaSpikepattern,butitisveryshortandsmooth,withinahighrangeofvalues

Fig.9.4(q)ThispatternislikeSpikeandDecreasingpatterns.

Table9.1:ThelinguisticdescriptionsderivedfortheunknownsamplesoftimeseriespatternsinFigure9.4.

9.3. EVALUATION: DESCRIPTIONS FROM CONCEPTUAL SPACES 147

9.3 Evaluation: Descriptions from Conceptual

Spaces vs. Other Semantic Models

As described in Section 5.3 in Part I, assessing the benefits of the proposedconceptual space representation directly is not a trivial problem. Instead, theusefulness of the constructed conceptual space of the time series patterns hasbeen evaluated via the linguistic descriptions derived from such space. Again,the experiment evaluates the following aims: (1) to measure the feasibility of de-riving accurate descriptions to distinguish unknown pattern observations, and(2) to assess the goodness of the descriptions derived from conceptual spaces incomparison to the descriptions derived from other base-line models. To theseends, a survey was conducted in which participants were asked to

1. identify specific time series pattern based on their linguistic descriptionderived from the conceptual space, and


9.3.1 Survey: Design and Procedure for Pattern Data Set

Similar to the survey used for leaf data set, the main body of the survey for pat-terns was composed of two parts, with the same designs of questions explainedin Section 5.3.1. Here, it is worth to mention that the evaluation consideredthe results of 17 unknown patterns from a pool of unknown pattern examples.Moreover, among all responses to the survey, 89 valid responses have been stud-ied for the pattern data set. Most of the participants were in the range of 25-44years old, and they were mostly educated in computer science or equivalent.Besides, most of the participants were fluent in English speaking.

About the expertise level of the participants, the results show that the partic-ipants were more familiar with the terminology that have been used for patterndata set, rather than leaf. As shown in Figure 9.5, for the leaf data set, 20%of the participants knew none of the lexical items, 70% knew few or some ofthem, and only 10% almost all of them (See Figure 9.5a). But for the patterndata set, 30% of the participants knew few, or some of the lexical items, morethan 40% knew most of them, and more than 25% knew all the introduced ter-minology (See Figure 9.5b). Comparing the percentages of the expertise levelshows that the lexicon used in the descriptions of the patterns was more famil-iar to the participants.

9.3.2 Identifying Pattern Observations from Linguistic

Descriptions

Participants were able to successfully identify all the unknown observations(17 patterns) with the help of the corresponding conceptual space descriptions.

9.3.EVALUATION:DESCRIPTIONSFROMCONCEPTUALSPACES147

9.3Evaluation:DescriptionsfromConceptual

Spacesvs.OtherSemanticModels

AsdescribedinSection5.3inPartI,assessingthebenefitsoftheproposedconceptualspacerepresentationdirectlyisnotatrivialproblem.Instead,theusefulnessoftheconstructedconceptualspaceofthetimeseriespatternshasbeenevaluatedviathelinguisticdescriptionsderivedfromsuchspace.Again,theexperimentevaluatesthefollowingaims:(1)tomeasurethefeasibilityofde-rivingaccuratedescriptionstodistinguishunknownpatternobservations,and(2)toassessthegoodnessofthedescriptionsderivedfromconceptualspacesincomparisontothedescriptionsderivedfromotherbase-linemodels.Totheseends,asurveywasconductedinwhichparticipantswereaskedto

1.identifyspecifictimeseriespatternbasedontheirlinguisticdescriptionderivedfromtheconceptualspace,and


9.3.1Survey:DesignandProcedureforPatternDataSet

Similartothesurveyusedforleafdataset,themainbodyofthesurveyforpat-ternswascomposedoftwoparts,withthesamedesignsofquestionsexplainedinSection5.3.1.Here,itisworthtomentionthattheevaluationconsideredtheresultsof17unknownpatternsfromapoolofunknownpatternexamples.Moreover,amongallresponsestothesurvey,89validresponseshavebeenstud-iedforthepatterndataset.Mostoftheparticipantswereintherangeof25-44yearsold,andtheyweremostlyeducatedincomputerscienceorequivalent.Besides,mostoftheparticipantswerefluentinEnglishspeaking.

Abouttheexpertiseleveloftheparticipants,theresultsshowthatthepartic-ipantsweremorefamiliarwiththeterminologythathavebeenusedforpatterndataset,ratherthanleaf.AsshowninFigure9.5,fortheleafdataset,20%oftheparticipantsknewnoneofthelexicalitems,70%knewfeworsomeofthem,andonly10%almostallofthem(SeeFigure9.5a).Butforthepatterndataset,30%oftheparticipantsknewfew,orsomeofthelexicalitems,morethan40%knewmostofthem,andmorethan25%knewalltheintroducedter-minology(SeeFigure9.5b).Comparingthepercentagesoftheexpertiselevelshowsthatthelexiconusedinthedescriptionsofthepatternswasmorefamil-iartotheparticipants.

9.3.2IdentifyingPatternObservationsfromLinguistic

Descriptions

Participantswereabletosuccessfullyidentifyalltheunknownobservations(17patterns)withthehelpofthecorrespondingconceptualspacedescriptions.

9.3. EVALUATION: DESCRIPTIONS FROM CONCEPTUAL SPACES 147

9.3 Evaluation: Descriptions from Conceptual

Spaces vs. Other Semantic Models

As described in Section 5.3 in Part I, assessing the benefits of the proposedconceptual space representation directly is not a trivial problem. Instead, theusefulness of the constructed conceptual space of the time series patterns hasbeen evaluated via the linguistic descriptions derived from such space. Again,the experiment evaluates the following aims: (1) to measure the feasibility of de-riving accurate descriptions to distinguish unknown pattern observations, and(2) to assess the goodness of the descriptions derived from conceptual spaces incomparison to the descriptions derived from other base-line models. To theseends, a survey was conducted in which participants were asked to

1. identify specific time series pattern based on their linguistic descriptionderived from the conceptual space, and


9.3.1 Survey: Design and Procedure for Pattern Data Set

Similar to the survey used for leaf data set, the main body of the survey for pat-terns was composed of two parts, with the same designs of questions explainedin Section 5.3.1. Here, it is worth to mention that the evaluation consideredthe results of 17 unknown patterns from a pool of unknown pattern examples.Moreover, among all responses to the survey, 89 valid responses have been stud-ied for the pattern data set. Most of the participants were in the range of 25-44years old, and they were mostly educated in computer science or equivalent.Besides, most of the participants were fluent in English speaking.

About the expertise level of the participants, the results show that the partic-ipants were more familiar with the terminology that have been used for patterndata set, rather than leaf. As shown in Figure 9.5, for the leaf data set, 20%of the participants knew none of the lexical items, 70% knew few or some ofthem, and only 10% almost all of them (See Figure 9.5a). But for the patterndata set, 30% of the participants knew few, or some of the lexical items, morethan 40% knew most of them, and more than 25% knew all the introduced ter-minology (See Figure 9.5b). Comparing the percentages of the expertise levelshows that the lexicon used in the descriptions of the patterns was more famil-iar to the participants.

9.3.2 Identifying Pattern Observations from Linguistic

Descriptions

Participants were able to successfully identify all the unknown observations(17 patterns) with the help of the corresponding conceptual space descriptions.

9.3.EVALUATION:DESCRIPTIONSFROMCONCEPTUALSPACES147

9.3Evaluation:DescriptionsfromConceptual

Spacesvs.OtherSemanticModels

AsdescribedinSection5.3inPartI,assessingthebenefitsoftheproposedconceptualspacerepresentationdirectlyisnotatrivialproblem.Instead,theusefulnessoftheconstructedconceptualspaceofthetimeseriespatternshasbeenevaluatedviathelinguisticdescriptionsderivedfromsuchspace.Again,theexperimentevaluatesthefollowingaims:(1)tomeasurethefeasibilityofde-rivingaccuratedescriptionstodistinguishunknownpatternobservations,and(2)toassessthegoodnessofthedescriptionsderivedfromconceptualspacesincomparisontothedescriptionsderivedfromotherbase-linemodels.Totheseends,asurveywasconductedinwhichparticipantswereaskedto

1.identifyspecifictimeseriespatternbasedontheirlinguisticdescriptionderivedfromtheconceptualspace,and


9.3.1Survey:DesignandProcedureforPatternDataSet

Similartothesurveyusedforleafdataset,themainbodyofthesurveyforpat-ternswascomposedoftwoparts,withthesamedesignsofquestionsexplainedinSection5.3.1.Here,itisworthtomentionthattheevaluationconsideredtheresultsof17unknownpatternsfromapoolofunknownpatternexamples.Moreover,amongallresponsestothesurvey,89validresponseshavebeenstud-iedforthepatterndataset.Mostoftheparticipantswereintherangeof25-44yearsold,andtheyweremostlyeducatedincomputerscienceorequivalent.Besides,mostoftheparticipantswerefluentinEnglishspeaking.

Abouttheexpertiseleveloftheparticipants,theresultsshowthatthepartic-ipantsweremorefamiliarwiththeterminologythathavebeenusedforpatterndataset,ratherthanleaf.AsshowninFigure9.5,fortheleafdataset,20%oftheparticipantsknewnoneofthelexicalitems,70%knewfeworsomeofthem,andonly10%almostallofthem(SeeFigure9.5a).Butforthepatterndataset,30%oftheparticipantsknewfew,orsomeofthelexicalitems,morethan40%knewmostofthem,andmorethan25%knewalltheintroducedter-minology(SeeFigure9.5b).Comparingthepercentagesoftheexpertiselevelshowsthatthelexiconusedinthedescriptionsofthepatternswasmorefamil-iartotheparticipants.

9.3.2IdentifyingPatternObservationsfromLinguistic

Descriptions

Participantswereabletosuccessfullyidentifyalltheunknownobservations(17patterns)withthehelpofthecorrespondingconceptualspacedescriptions.


(a) familiarity with leaf terminology. (b) familiarity with pattern terminology.

Figure 9.5: Pie charts, showing the expertise level of the participants by mea-suring how familiar they are with the introduced terminology for class labelsand features in (a) leaf and (b) pattern data sets. Participants’ choices were: Iam familiar with none/a few/some/most/all of the terminology.

The success rate to identify the correct image for each description in the patterndata set was 79%±11%. As mentioned in Section 5.3, this success rate for leafdata set was 73%±13%. The difference is reasonable to assume that the higherfamiliarity of participants with the pattern data set is a possible explanation forthe better success rates.

For further investigation of the incorrectly identified (i.e., misidentified) ex-amples, the geometrical similarity of these answers to the correct one is calcu-lated in the conceptual space (multi-domain). According to [86], the similarityin conceptual spaces can be calculated by applying Euclidean distance with thedomains and the city-block distance between them. To assess the similarities inconceptual space, also the geometrical similarity of the same instances is calcu-lated but in a full feature space (single-domain) by applying Euclidean distance.Two interesting results have been obtained: First, the misidentified examples arenot uniformly distributed between all possible choices, but instead, participantstended to make similar mistakes. Second, the common misidentified examplesare most of the times (76% for patterns) the closest instance to the correct onein the conceptual space. In the full feature space this was only occasionally true(29% for patterns). This shows that the confused examples with each otherare commonly the nearest instances within the multi-domain conceptual space,which is mostly not true in the full feature space.

The results from the first part of the survey show that the proposed con-ceptual space representation a) is applicable to derive semantic descriptions forunknown pattern observations, and b) is suitable to represent the cognitivelysimilar pattern observations among the multiple domains.


(a)familiaritywithleafterminology.(b)familiaritywithpatternterminology.

Figure9.5:Piecharts,showingtheexpertiseleveloftheparticipantsbymea-suringhowfamiliartheyarewiththeintroducedterminologyforclasslabelsandfeaturesin(a)leafand(b)patterndatasets.Participants’choiceswere:Iamfamiliarwithnone/afew/some/most/alloftheterminology.

Thesuccessratetoidentifythecorrectimageforeachdescriptioninthepatterndatasetwas79%±11%.AsmentionedinSection5.3,thissuccessrateforleafdatasetwas73%±13%.Thedifferenceisreasonabletoassumethatthehigherfamiliarityofparticipantswiththepatterndatasetisapossibleexplanationforthebettersuccessrates.

Forfurtherinvestigationoftheincorrectlyidentified(i.e.,misidentified)ex-amples,thegeometricalsimilarityoftheseanswerstothecorrectoneiscalcu-latedintheconceptualspace(multi-domain).Accordingto[86],thesimilarityinconceptualspacescanbecalculatedbyapplyingEuclideandistancewiththedomainsandthecity-blockdistancebetweenthem.Toassessthesimilaritiesinconceptualspace,alsothegeometricalsimilarityofthesameinstancesiscalcu-latedbutinafullfeaturespace(single-domain)byapplyingEuclideandistance.Twointerestingresultshavebeenobtained:First,themisidentifiedexamplesarenotuniformlydistributedbetweenallpossiblechoices,butinstead,participantstendedtomakesimilarmistakes.Second,thecommonmisidentifiedexamplesaremostofthetimes(76%forpatterns)theclosestinstancetothecorrectoneintheconceptualspace.Inthefullfeaturespacethiswasonlyoccasionallytrue(29%forpatterns).Thisshowsthattheconfusedexampleswitheachotherarecommonlythenearestinstanceswithinthemulti-domainconceptualspace,whichismostlynottrueinthefullfeaturespace.

Theresultsfromthefirstpartofthesurveyshowthattheproposedcon-ceptualspacerepresentationa)isapplicabletoderivesemanticdescriptionsforunknownpatternobservations,andb)issuitabletorepresentthecognitivelysimilarpatternobservationsamongthemultipledomains.


(a) familiarity with leaf terminology. (b) familiarity with pattern terminology.

Figure 9.5: Pie charts, showing the expertise level of the participants by mea-suring how familiar they are with the introduced terminology for class labelsand features in (a) leaf and (b) pattern data sets. Participants’ choices were: Iam familiar with none/a few/some/most/all of the terminology.

The success rate to identify the correct image for each description in the patterndata set was 79%±11%. As mentioned in Section 5.3, this success rate for leafdata set was 73%±13%. The difference is reasonable to assume that the higherfamiliarity of participants with the pattern data set is a possible explanation forthe better success rates.

For further investigation of the incorrectly identified (i.e., misidentified) ex-amples, the geometrical similarity of these answers to the correct one is calcu-lated in the conceptual space (multi-domain). According to [86], the similarityin conceptual spaces can be calculated by applying Euclidean distance with thedomains and the city-block distance between them. To assess the similarities inconceptual space, also the geometrical similarity of the same instances is calcu-lated but in a full feature space (single-domain) by applying Euclidean distance.Two interesting results have been obtained: First, the misidentified examples arenot uniformly distributed between all possible choices, but instead, participantstended to make similar mistakes. Second, the common misidentified examplesare most of the times (76% for patterns) the closest instance to the correct onein the conceptual space. In the full feature space this was only occasionally true(29% for patterns). This shows that the confused examples with each otherare commonly the nearest instances within the multi-domain conceptual space,which is mostly not true in the full feature space.

The results from the first part of the survey show that the proposed con-ceptual space representation a) is applicable to derive semantic descriptions forunknown pattern observations, and b) is suitable to represent the cognitivelysimilar pattern observations among the multiple domains.


(a)familiaritywithleafterminology.(b)familiaritywithpatternterminology.

Figure9.5:Piecharts,showingtheexpertiseleveloftheparticipantsbymea-suringhowfamiliartheyarewiththeintroducedterminologyforclasslabelsandfeaturesin(a)leafand(b)patterndatasets.Participants’choiceswere:Iamfamiliarwithnone/afew/some/most/alloftheterminology.

Thesuccessratetoidentifythecorrectimageforeachdescriptioninthepatterndatasetwas79%±11%.AsmentionedinSection5.3,thissuccessrateforleafdatasetwas73%±13%.Thedifferenceisreasonabletoassumethatthehigherfamiliarityofparticipantswiththepatterndatasetisapossibleexplanationforthebettersuccessrates.

Forfurtherinvestigationoftheincorrectlyidentified(i.e.,misidentified)ex-amples,thegeometricalsimilarityoftheseanswerstothecorrectoneiscalcu-latedintheconceptualspace(multi-domain).Accordingto[86],thesimilarityinconceptualspacescanbecalculatedbyapplyingEuclideandistancewiththedomainsandthecity-blockdistancebetweenthem.Toassessthesimilaritiesinconceptualspace,alsothegeometricalsimilarityofthesameinstancesiscalcu-latedbutinafullfeaturespace(single-domain)byapplyingEuclideandistance.Twointerestingresultshavebeenobtained:First,themisidentifiedexamplesarenotuniformlydistributedbetweenallpossiblechoices,butinstead,participantstendedtomakesimilarmistakes.Second,thecommonmisidentifiedexamplesaremostofthetimes(76%forpatterns)theclosestinstancetothecorrectoneintheconceptualspace.Inthefullfeaturespacethiswasonlyoccasionallytrue(29%forpatterns).Thisshowsthattheconfusedexampleswitheachotherarecommonlythenearestinstanceswithinthemulti-domainconceptualspace,whichismostlynottrueinthefullfeaturespace.

Theresultsfromthefirstpartofthesurveyshowthattheproposedcon-ceptualspacerepresentationa)isapplicabletoderivesemanticdescriptionsforunknownpatternobservations,andb)issuitabletorepresentthecognitivelysimilarpatternobservationsamongthemultipledomains.

9.3. EVALUATION OF DESCRIPTIONS 149

Table 9.2: The overall scores calculated from the rating responses to the differ-ent models in pattern data set. The numbers show average scores (and standarddeviations) in the range of 1 to 5.



Figure 9.6: The box plot of the rating scores received for each of the modelsderiving descriptions in pattern data set.

9.3.3 Rating Various Linguistic Descriptions of a Pattern

Observation

In the results of the rating scale questions, the description derived from the con-ceptual space model is compared with the descriptions derived from the twoother semantic models that are explained in details in Section 5.3.3. Table 9.2shows the statistical summary of the rating scores received for the descriptionsderived from each of the approaches (Conceptual, Generative and Discrimina-tive) in pattern data set. Also, these scores are depicted in the form of box plotin Figure 9.6.

Similar to the analysis for leaf data set, an ANOVA test has been applied toshow that the conceptual space description (Conceptual) is significantly prefer-able rated than the two alternatives (Generative and Discriminative). The one-way ANOVA test showed a significant effect of the models on the scores. Forthe pattern data set, Conceptual has the mean significantly different from Gen-

9.3.EVALUATIONOFDESCRIPTIONS149

Table9.2:Theoverallscorescalculatedfromtheratingresponsestothediffer-entmodelsinpatterndataset.Thenumbersshowaveragescores(andstandarddeviations)intherangeof1to5.



Figure9.6:Theboxplotoftheratingscoresreceivedforeachofthemodelsderivingdescriptionsinpatterndataset.

9.3.3RatingVariousLinguisticDescriptionsofaPattern

Observation

Intheresultsoftheratingscalequestions,thedescriptionderivedfromthecon-ceptualspacemodeliscomparedwiththedescriptionsderivedfromthetwoothersemanticmodelsthatareexplainedindetailsinSection5.3.3.Table9.2showsthestatisticalsummaryoftheratingscoresreceivedforthedescriptionsderivedfromeachoftheapproaches(Conceptual,GenerativeandDiscrimina-tive)inpatterndataset.Also,thesescoresaredepictedintheformofboxplotinFigure9.6.

Similartotheanalysisforleafdataset,anANOVAtesthasbeenappliedtoshowthattheconceptualspacedescription(Conceptual)issignificantlyprefer-ableratedthanthetwoalternatives(GenerativeandDiscriminative).Theone-wayANOVAtestshowedasignificanteffectofthemodelsonthescores.Forthepatterndataset,ConceptualhasthemeansignificantlydifferentfromGen-

9.3. EVALUATION OF DESCRIPTIONS 149

Table 9.2: The overall scores calculated from the rating responses to the differ-ent models in pattern data set. The numbers show average scores (and standarddeviations) in the range of 1 to 5.



Figure 9.6: The box plot of the rating scores received for each of the modelsderiving descriptions in pattern data set.

9.3.3 Rating Various Linguistic Descriptions of a Pattern

Observation

In the results of the rating scale questions, the description derived from the con-ceptual space model is compared with the descriptions derived from the twoother semantic models that are explained in details in Section 5.3.3. Table 9.2shows the statistical summary of the rating scores received for the descriptionsderived from each of the approaches (Conceptual, Generative and Discrimina-tive) in pattern data set. Also, these scores are depicted in the form of box plotin Figure 9.6.

Similar to the analysis for leaf data set, an ANOVA test has been applied toshow that the conceptual space description (Conceptual) is significantly prefer-able rated than the two alternatives (Generative and Discriminative). The one-way ANOVA test showed a significant effect of the models on the scores. Forthe pattern data set, Conceptual has the mean significantly different from Gen-

9.3.EVALUATIONOFDESCRIPTIONS149

Table9.2:Theoverallscorescalculatedfromtheratingresponsestothediffer-entmodelsinpatterndataset.Thenumbersshowaveragescores(andstandarddeviations)intherangeof1to5.



Figure9.6:Theboxplotoftheratingscoresreceivedforeachofthemodelsderivingdescriptionsinpatterndataset.

9.3.3RatingVariousLinguisticDescriptionsofaPattern

Observation

Intheresultsoftheratingscalequestions,thedescriptionderivedfromthecon-ceptualspacemodeliscomparedwiththedescriptionsderivedfromthetwoothersemanticmodelsthatareexplainedindetailsinSection5.3.3.Table9.2showsthestatisticalsummaryoftheratingscoresreceivedforthedescriptionsderivedfromeachoftheapproaches(Conceptual,GenerativeandDiscrimina-tive)inpatterndataset.Also,thesescoresaredepictedintheformofboxplotinFigure9.6.

Similartotheanalysisforleafdataset,anANOVAtesthasbeenappliedtoshowthattheconceptualspacedescription(Conceptual)issignificantlyprefer-ableratedthanthetwoalternatives(GenerativeandDiscriminative).Theone-wayANOVAtestshowedasignificanteffectofthemodelsonthescores.Forthepatterndataset,ConceptualhasthemeansignificantlydifferentfromGen-



pattern data set


Generative & DiscriminativeF(2, 1173) = 23.72,

p < 10−13

Wilcoxon TestConceptual vs. Generative p < 10−12

Conceptual vs. Discriminative p < 10−07

Generative vs. Discriminative p > 0.05 (∗)

erative and Discriminative, p < .0001 (two-tailed). The details of the test hasbeen shown in Table 9.3.

Moreover, since the ratings are ordinal, also a non-parametric test (i.e.,Wilcoxon Test) is carried out to identify the significant differences between rat-ings by comparing each pair of the scores. Table 9.3 shows the p-values of thismethod for each pair of models. The output showed that Conceptual model issignificantly different from Generative and Discriminative models (p < .0001).Beside the significant difference of conceptual model, an interesting outcomeof the tests is the scores of Generative and Discriminative for two data sets.In the leaf data set, participants have given higher scores to Generative thanDiscriminative (p < .0001). But in the pattern data set, there is no significantdifference between these two models (p > .05). It can be interpreted that besideConceptual which is the most preferred description, subjects preferred to see allthe features as the description for leaf samples. But about the pattern samples,there is no preference between the descriptions including either the features orthe concept labels related to the shown patterns.

Overall, the results from this part of the survey show that the proposedconceptual space representation a) is an appropriate semantic inference modelto derive linguistic descriptions for unknown pattern observations, and b) suc-cessfully derives descriptions (from multi-domain space) that are naturally pre-ferred by participants, in comparison to the other alternative models (fromsingle-domain space).


This chapter has presented the process of applying the semantic representa-tion proposed in Part I on the physiological pattern data sets. This processautomatically constructs the conceptual space of the abstracted physiologicalpatterns, and then, it infers semantic descriptions for a set of unknown pat-terns. The constructed conceptual space of patterns involves two domains thatinclude five quality dimensions in total. This space represents four class of pat-terns: Increasing, Decreasing, Spike, and Oscillation. As it was expected beforeapplying the data-driven approach to construct the conceptual spaces, the con-cepts of increasing and decreasing have been represented in the same domain.



patterndataset



p<10−13


−12


Generativevs.Discriminativep>0.05(∗)

erativeandDiscriminative,p<.0001(two-tailed).ThedetailsofthetesthasbeenshowninTable9.3.

Moreover,sincetheratingsareordinal,alsoanon-parametrictest(i.e.,WilcoxonTest)iscarriedouttoidentifythesignificantdifferencesbetweenrat-ingsbycomparingeachpairofthescores.Table9.3showsthep-valuesofthismethodforeachpairofmodels.TheoutputshowedthatConceptualmodelissignificantlydifferentfromGenerativeandDiscriminativemodels(p<.0001).Besidethesignificantdifferenceofconceptualmodel,aninterestingoutcomeofthetestsisthescoresofGenerativeandDiscriminativefortwodatasets.Intheleafdataset,participantshavegivenhigherscorestoGenerativethanDiscriminative(p<.0001).Butinthepatterndataset,thereisnosignificantdifferencebetweenthesetwomodels(p>.05).ItcanbeinterpretedthatbesideConceptualwhichisthemostpreferreddescription,subjectspreferredtoseeallthefeaturesasthedescriptionforleafsamples.Butaboutthepatternsamples,thereisnopreferencebetweenthedescriptionsincludingeitherthefeaturesortheconceptlabelsrelatedtotheshownpatterns.

Overall,theresultsfromthispartofthesurveyshowthattheproposedconceptualspacerepresentationa)isanappropriatesemanticinferencemodeltoderivelinguisticdescriptionsforunknownpatternobservations,andb)suc-cessfullyderivesdescriptions(frommulti-domainspace)thatarenaturallypre-ferredbyparticipants,incomparisontotheotheralternativemodels(fromsingle-domainspace).


Thischapterhaspresentedtheprocessofapplyingthesemanticrepresenta-tionproposedinPartIonthephysiologicalpatterndatasets.Thisprocessautomaticallyconstructstheconceptualspaceoftheabstractedphysiologicalpatterns,andthen,itinferssemanticdescriptionsforasetofunknownpat-terns.Theconstructedconceptualspaceofpatternsinvolvestwodomainsthatincludefivequalitydimensionsintotal.Thisspacerepresentsfourclassofpat-terns:Increasing,Decreasing,Spike,andOscillation.Asitwasexpectedbeforeapplyingthedata-drivenapproachtoconstructtheconceptualspaces,thecon-ceptsofincreasinganddecreasinghavebeenrepresentedinthesamedomain.



pattern data set


Generative & DiscriminativeF(2, 1173) = 23.72,

p < 10−13

Wilcoxon TestConceptual vs. Generative p < 10−12

Conceptual vs. Discriminative p < 10−07

Generative vs. Discriminative p > 0.05 (∗)

erative and Discriminative, p < .0001 (two-tailed). The details of the test hasbeen shown in Table 9.3.

Moreover, since the ratings are ordinal, also a non-parametric test (i.e.,Wilcoxon Test) is carried out to identify the significant differences between rat-ings by comparing each pair of the scores. Table 9.3 shows the p-values of thismethod for each pair of models. The output showed that Conceptual model issignificantly different from Generative and Discriminative models (p < .0001).Beside the significant difference of conceptual model, an interesting outcomeof the tests is the scores of Generative and Discriminative for two data sets.In the leaf data set, participants have given higher scores to Generative thanDiscriminative (p < .0001). But in the pattern data set, there is no significantdifference between these two models (p > .05). It can be interpreted that besideConceptual which is the most preferred description, subjects preferred to see allthe features as the description for leaf samples. But about the pattern samples,there is no preference between the descriptions including either the features orthe concept labels related to the shown patterns.

Overall, the results from this part of the survey show that the proposedconceptual space representation a) is an appropriate semantic inference modelto derive linguistic descriptions for unknown pattern observations, and b) suc-cessfully derives descriptions (from multi-domain space) that are naturally pre-ferred by participants, in comparison to the other alternative models (fromsingle-domain space).


This chapter has presented the process of applying the semantic representa-tion proposed in Part I on the physiological pattern data sets. This processautomatically constructs the conceptual space of the abstracted physiologicalpatterns, and then, it infers semantic descriptions for a set of unknown pat-terns. The constructed conceptual space of patterns involves two domains thatinclude five quality dimensions in total. This space represents four class of pat-terns: Increasing, Decreasing, Spike, and Oscillation. As it was expected beforeapplying the data-driven approach to construct the conceptual spaces, the con-cepts of increasing and decreasing have been represented in the same domain.



patterndataset



p<10−13


−12


Generativevs.Discriminativep>0.05(∗)

erativeandDiscriminative,p<.0001(two-tailed).ThedetailsofthetesthasbeenshowninTable9.3.

Moreover,sincetheratingsareordinal,alsoanon-parametrictest(i.e.,WilcoxonTest)iscarriedouttoidentifythesignificantdifferencesbetweenrat-ingsbycomparingeachpairofthescores.Table9.3showsthep-valuesofthismethodforeachpairofmodels.TheoutputshowedthatConceptualmodelissignificantlydifferentfromGenerativeandDiscriminativemodels(p<.0001).Besidethesignificantdifferenceofconceptualmodel,aninterestingoutcomeofthetestsisthescoresofGenerativeandDiscriminativefortwodatasets.Intheleafdataset,participantshavegivenhigherscorestoGenerativethanDiscriminative(p<.0001).Butinthepatterndataset,thereisnosignificantdifferencebetweenthesetwomodels(p>.05).ItcanbeinterpretedthatbesideConceptualwhichisthemostpreferreddescription,subjectspreferredtoseeallthefeaturesasthedescriptionforleafsamples.Butaboutthepatternsamples,thereisnopreferencebetweenthedescriptionsincludingeitherthefeaturesortheconceptlabelsrelatedtotheshownpatterns.

Overall,theresultsfromthispartofthesurveyshowthattheproposedconceptualspacerepresentationa)isanappropriatesemanticinferencemodeltoderivelinguisticdescriptionsforunknownpatternobservations,andb)suc-cessfullyderivesdescriptions(frommulti-domainspace)thatarenaturallypre-ferredbyparticipants,incomparisontotheotheralternativemodels(fromsingle-domainspace).


Thischapterhaspresentedtheprocessofapplyingthesemanticrepresenta-tionproposedinPartIonthephysiologicalpatterndatasets.Thisprocessautomaticallyconstructstheconceptualspaceoftheabstractedphysiologicalpatterns,andthen,itinferssemanticdescriptionsforasetofunknownpat-terns.Theconstructedconceptualspaceofpatternsinvolvestwodomainsthatincludefivequalitydimensionsintotal.Thisspacerepresentsfourclassofpat-terns:Increasing,Decreasing,Spike,andOscillation.Asitwasexpectedbeforeapplyingthedata-drivenapproachtoconstructtheconceptualspaces,thecon-ceptsofincreasinganddecreasinghavebeenrepresentedinthesamedomain.


Beside the ‘slope’ feature, the other selected quality dimension (i.e., ‘start-enddiff ’) to represent these concepts was not the obvious choice. However, it seemsthat the values of this feature were dominant enough to play the distinguish theIncreasing and Decreasing concepts from the rest of the concepts. The otherdomain that represents Spike and Oscillation concepts also contains interest-ing quality dimensions. The ‘time interval’, ‘min-max diff ’, and ‘frequency’ arethe features that are more or less the obvious choices to distinguish or describethese two concepts. However, as one can see, some features like ‘entropy’ or‘standard deviation’ have not been appeared in the conceptual space, meaningthat their values were not good enough (in comparison with the other features)to separate any concept from the rest.

Furthermore, this chapter has performed an empirical evaluation, similar tothe assessment presented in Chapter 5, to show the goodness of the conceptualspace of the patterns used for text generation. This assessment has conducteda survey by asking the human subjects to identify the instances by reading thegenerated texts by conceptual spaces model, along with rating three differentgenerated texts by different semantic models to compare the output text of thephysiological patterns.

In sum, this chapter has shown the ability of the semantic representationapproach proposed in Part I to infer linguistic descriptions for physiologicalsensor patterns, which are not necessarily known for the end user. An empiricalevaluations was performed to evaluate the goodness of the generated texts.


Besidethe‘slope’feature,theotherselectedqualitydimension(i.e.,‘start-enddiff’)torepresenttheseconceptswasnottheobviouschoice.However,itseemsthatthevaluesofthisfeatureweredominantenoughtoplaythedistinguishtheIncreasingandDecreasingconceptsfromtherestoftheconcepts.TheotherdomainthatrepresentsSpikeandOscillationconceptsalsocontainsinterest-ingqualitydimensions.The‘timeinterval’,‘min-maxdiff’,and‘frequency’arethefeaturesthataremoreorlesstheobviouschoicestodistinguishordescribethesetwoconcepts.However,asonecansee,somefeatureslike‘entropy’or‘standarddeviation’havenotbeenappearedintheconceptualspace,meaningthattheirvalueswerenotgoodenough(incomparisonwiththeotherfeatures)toseparateanyconceptfromtherest.

Furthermore,thischapterhasperformedanempiricalevaluation,similartotheassessmentpresentedinChapter5,toshowthegoodnessoftheconceptualspaceofthepatternsusedfortextgeneration.Thisassessmenthasconductedasurveybyaskingthehumansubjectstoidentifytheinstancesbyreadingthegeneratedtextsbyconceptualspacesmodel,alongwithratingthreedifferentgeneratedtextsbydifferentsemanticmodelstocomparetheoutputtextofthephysiologicalpatterns.

Insum,thischapterhasshowntheabilityofthesemanticrepresentationapproachproposedinPartItoinferlinguisticdescriptionsforphysiologicalsensorpatterns,whicharenotnecessarilyknownfortheenduser.Anempiricalevaluationswasperformedtoevaluatethegoodnessofthegeneratedtexts.


Beside the ‘slope’ feature, the other selected quality dimension (i.e., ‘start-enddiff ’) to represent these concepts was not the obvious choice. However, it seemsthat the values of this feature were dominant enough to play the distinguish theIncreasing and Decreasing concepts from the rest of the concepts. The otherdomain that represents Spike and Oscillation concepts also contains interest-ing quality dimensions. The ‘time interval’, ‘min-max diff ’, and ‘frequency’ arethe features that are more or less the obvious choices to distinguish or describethese two concepts. However, as one can see, some features like ‘entropy’ or‘standard deviation’ have not been appeared in the conceptual space, meaningthat their values were not good enough (in comparison with the other features)to separate any concept from the rest.

Furthermore, this chapter has performed an empirical evaluation, similar tothe assessment presented in Chapter 5, to show the goodness of the conceptualspace of the patterns used for text generation. This assessment has conducteda survey by asking the human subjects to identify the instances by reading thegenerated texts by conceptual spaces model, along with rating three differentgenerated texts by different semantic models to compare the output text of thephysiological patterns.

In sum, this chapter has shown the ability of the semantic representationapproach proposed in Part I to infer linguistic descriptions for physiologicalsensor patterns, which are not necessarily known for the end user. An empiricalevaluations was performed to evaluate the goodness of the generated texts.


Besidethe‘slope’feature,theotherselectedqualitydimension(i.e.,‘start-enddiff’)torepresenttheseconceptswasnottheobviouschoice.However,itseemsthatthevaluesofthisfeatureweredominantenoughtoplaythedistinguishtheIncreasingandDecreasingconceptsfromtherestoftheconcepts.TheotherdomainthatrepresentsSpikeandOscillationconceptsalsocontainsinterest-ingqualitydimensions.The‘timeinterval’,‘min-maxdiff’,and‘frequency’arethefeaturesthataremoreorlesstheobviouschoicestodistinguishordescribethesetwoconcepts.However,asonecansee,somefeatureslike‘entropy’or‘standarddeviation’havenotbeenappearedintheconceptualspace,meaningthattheirvalueswerenotgoodenough(incomparisonwiththeotherfeatures)toseparateanyconceptfromtherest.

Furthermore,thischapterhasperformedanempiricalevaluation,similartotheassessmentpresentedinChapter5,toshowthegoodnessoftheconceptualspaceofthepatternsusedfortextgeneration.Thisassessmenthasconductedasurveybyaskingthehumansubjectstoidentifytheinstancesbyreadingthegeneratedtextsbyconceptualspacesmodel,alongwithratingthreedifferentgeneratedtextsbydifferentsemanticmodelstocomparetheoutputtextofthephysiologicalpatterns.

Insum,thischapterhasshowntheabilityofthesemanticrepresentationapproachproposedinPartItoinferlinguisticdescriptionsforphysiologicalsensorpatterns,whicharenotnecessarilyknownfortheenduser.Anempiricalevaluationswasperformedtoevaluatethegoodnessofthegeneratedtexts.

Chapter 10

Conclusions

“And you may ask yourself: Well, how did I get here?”— Talking Heads (1975–1991)

This chapter summarises the contributions of this thesis. It provides anoverview of the limitations of different aspects presented in the thesis.

Finally, the chapter concludes by presenting an outlook of the future researchin different directions.

10.1 Summary of Contributions

The overall contribution of this thesis has been to provide different data-drivenstrategies for mining and representing numerical data into a semantic repre-sentation and then generate linguistic descriptions for such information. Theprocess, in summary, has been presented in three steps: 1) extracting and min-ing numerical information (e.g., physiological sensor data), 2) modelling theinformation in a semantic representation, and 3) generating linguistic descrip-tions for such information.

To present the achievements of this thesis, the rest of this section revis-its the introduced contributions (C1 to C4), given in Chapter 1 by explaininghow each of the contributions has been accomplished using the proposed ap-proaches.

10.1.1 Construction of Conceptual Spaces (C1)

This thesis has first introduced a data-driven approach to automatically con-struct conceptual spaces based on the input observations and their semanticattributes (Chapter 3). The following contributions have addressed the task ofconstructing a conceptual space:

153

Chapter10

Conclusions

“Andyoumayaskyourself:Well,howdidIgethere?”—TalkingHeads(1975–1991)

Thischaptersummarisesthecontributionsofthisthesis.Itprovidesanoverviewofthelimitationsofdifferentaspectspresentedinthethesis.

Finally,thechapterconcludesbypresentinganoutlookofthefutureresearchindifferentdirections.

10.1SummaryofContributions

Theoverallcontributionofthisthesishasbeentoprovidedifferentdata-drivenstrategiesforminingandrepresentingnumericaldataintoasemanticrepre-sentationandthengeneratelinguisticdescriptionsforsuchinformation.Theprocess,insummary,hasbeenpresentedinthreesteps:1)extractingandmin-ingnumericalinformation(e.g.,physiologicalsensordata),2)modellingtheinformationinasemanticrepresentation,and3)generatinglinguisticdescrip-tionsforsuchinformation.

Topresenttheachievementsofthisthesis,therestofthissectionrevis-itstheintroducedcontributions(C1toC4),giveninChapter1byexplaininghoweachofthecontributionshasbeenaccomplishedusingtheproposedap-proaches.

10.1.1ConstructionofConceptualSpaces(C1)

Thisthesishasfirstintroducedadata-drivenapproachtoautomaticallycon-structconceptualspacesbasedontheinputobservationsandtheirsemanticattributes(Chapter3).Thefollowingcontributionshaveaddressedthetaskofconstructingaconceptualspace:

153

Chapter 10

Conclusions

“And you may ask yourself: Well, how did I get here?”— Talking Heads (1975–1991)

This chapter summarises the contributions of this thesis. It provides anoverview of the limitations of different aspects presented in the thesis.

Finally, the chapter concludes by presenting an outlook of the future researchin different directions.

10.1 Summary of Contributions

The overall contribution of this thesis has been to provide different data-drivenstrategies for mining and representing numerical data into a semantic repre-sentation and then generate linguistic descriptions for such information. Theprocess, in summary, has been presented in three steps: 1) extracting and min-ing numerical information (e.g., physiological sensor data), 2) modelling theinformation in a semantic representation, and 3) generating linguistic descrip-tions for such information.

To present the achievements of this thesis, the rest of this section revis-its the introduced contributions (C1 to C4), given in Chapter 1 by explaininghow each of the contributions has been accomplished using the proposed ap-proaches.

10.1.1 Construction of Conceptual Spaces (C1)

This thesis has first introduced a data-driven approach to automatically con-struct conceptual spaces based on the input observations and their semanticattributes (Chapter 3). The following contributions have addressed the task ofconstructing a conceptual space:

153

Chapter10

Conclusions

“Andyoumayaskyourself:Well,howdidIgethere?”—TalkingHeads(1975–1991)

Thischaptersummarisesthecontributionsofthisthesis.Itprovidesanoverviewofthelimitationsofdifferentaspectspresentedinthethesis.

Finally,thechapterconcludesbypresentinganoutlookofthefutureresearchindifferentdirections.

10.1SummaryofContributions

Theoverallcontributionofthisthesishasbeentoprovidedifferentdata-drivenstrategiesforminingandrepresentingnumericaldataintoasemanticrepre-sentationandthengeneratelinguisticdescriptionsforsuchinformation.Theprocess,insummary,hasbeenpresentedinthreesteps:1)extractingandmin-ingnumericalinformation(e.g.,physiologicalsensordata),2)modellingtheinformationinasemanticrepresentation,and3)generatinglinguisticdescrip-tionsforsuchinformation.

Topresenttheachievementsofthisthesis,therestofthissectionrevis-itstheintroducedcontributions(C1toC4),giveninChapter1byexplaininghoweachofthecontributionshasbeenaccomplishedusingtheproposedap-proaches.

10.1.1ConstructionofConceptualSpaces(C1)

Thisthesishasfirstintroducedadata-drivenapproachtoautomaticallycon-structconceptualspacesbasedontheinputobservationsandtheirsemanticattributes(Chapter3).Thefollowingcontributionshaveaddressedthetaskofconstructingaconceptualspace:

153

154 CHAPTER 10. CONCLUSIONS

1) The specification of domains and quality dimensions was presented usingfeature selection and feature grouping methods, which exploit the relevance ofthe selected features to the known concepts. The approach has applied a featureselection method based on mutual information for feature selection (MIFS) torank the relevance of features and concepts. This ranking algorithm has beenapplied to each concept individually by applying the MIFS method on the inputdata set. After that, grouping the selected features has been introduced throughfirst constructing a bipartite graph representation of the feature-concept asso-ciations, and then exploiting the most representative subsets of features as thedomains by finding the best bicliques in such graph.

2) The concept representation was described in an instance-based manner.To form the concepts within the specified domains, the approach has calculatedthe convex regions of concepts and the salient weights of concept in relation tothe quality dimensions of the domains. Note that this calculation was entirelyformulated based on the associated observations to the concepts, without in-volving external knowledge.

A key finding in this contribution is that the proposed approach to constructconceptual spaces provides a generalisation for concept representation, wherethis representation can be derived from different types of input instances.

10.1.2 Semantic Inference in Conceptual Spaces (C2)

This thesis has introduced a semantic inference process to linguistically rep-resent a new observation within the built conceptual space (Chapter 4). Thefollowing contributions have addressed the task of semantic inferring in a con-ceptual space:

1) A symbol space was introduced as the complementary space to the con-ceptual space, which includes the semantics of the corresponding concepts andquality dimensions. This space enables the approach to determine the relevantsymbolic terms to represent a new observation.

2) The inference process to generate linguistic descriptions for an unknownobservation was presented in two phases: To determine associated concepts andquality dimensions of an unknown instance, its location within the constructedconceptual space has been investigated. This determination has been done byconsidering the inclusion of the instance in the regions of the space and the useof similarity measures in such space. After that, the lexicalisation of the instancehas been induced by extracting the semantic labels of the associated conceptsand quality dimensions. Finally, microplanning and realisation techniques havebeen applied in order to generate natural language descriptions.

One advantage of such inference model is that the proposed approach con-cerns which features and concepts should be inferred as the most suitable set ofinterpretable contents to describe an unknown observation.

154CHAPTER10.CONCLUSIONS

1)Thespecificationofdomainsandqualitydimensionswaspresentedusingfeatureselectionandfeaturegroupingmethods,whichexploittherelevanceoftheselectedfeaturestotheknownconcepts.Theapproachhasappliedafeatureselectionmethodbasedonmutualinformationforfeatureselection(MIFS)toranktherelevanceoffeaturesandconcepts.ThisrankingalgorithmhasbeenappliedtoeachconceptindividuallybyapplyingtheMIFSmethodontheinputdataset.Afterthat,groupingtheselectedfeatureshasbeenintroducedthroughfirstconstructingabipartitegraphrepresentationofthefeature-conceptasso-ciations,andthenexploitingthemostrepresentativesubsetsoffeaturesasthedomainsbyfindingthebestbicliquesinsuchgraph.

2)Theconceptrepresentationwasdescribedinaninstance-basedmanner.Toformtheconceptswithinthespecifieddomains,theapproachhascalculatedtheconvexregionsofconceptsandthesalientweightsofconceptinrelationtothequalitydimensionsofthedomains.Notethatthiscalculationwasentirelyformulatedbasedontheassociatedobservationstotheconcepts,withoutin-volvingexternalknowledge.

Akeyfindinginthiscontributionisthattheproposedapproachtoconstructconceptualspacesprovidesageneralisationforconceptrepresentation,wherethisrepresentationcanbederivedfromdifferenttypesofinputinstances.

10.1.2SemanticInferenceinConceptualSpaces(C2)

Thisthesishasintroducedasemanticinferenceprocesstolinguisticallyrep-resentanewobservationwithinthebuiltconceptualspace(Chapter4).Thefollowingcontributionshaveaddressedthetaskofsemanticinferringinacon-ceptualspace:

1)Asymbolspacewasintroducedasthecomplementaryspacetothecon-ceptualspace,whichincludesthesemanticsofthecorrespondingconceptsandqualitydimensions.Thisspaceenablestheapproachtodeterminetherelevantsymbolictermstorepresentanewobservation.

2)Theinferenceprocesstogeneratelinguisticdescriptionsforanunknownobservationwaspresentedintwophases:Todetermineassociatedconceptsandqualitydimensionsofanunknowninstance,itslocationwithintheconstructedconceptualspacehasbeeninvestigated.Thisdeterminationhasbeendonebyconsideringtheinclusionoftheinstanceintheregionsofthespaceandtheuseofsimilaritymeasuresinsuchspace.Afterthat,thelexicalisationoftheinstancehasbeeninducedbyextractingthesemanticlabelsoftheassociatedconceptsandqualitydimensions.Finally,microplanningandrealisationtechniqueshavebeenappliedinordertogeneratenaturallanguagedescriptions.

Oneadvantageofsuchinferencemodelisthattheproposedapproachcon-cernswhichfeaturesandconceptsshouldbeinferredasthemostsuitablesetofinterpretablecontentstodescribeanunknownobservation.


1) The specification of domains and quality dimensions was presented usingfeature selection and feature grouping methods, which exploit the relevance ofthe selected features to the known concepts. The approach has applied a featureselection method based on mutual information for feature selection (MIFS) torank the relevance of features and concepts. This ranking algorithm has beenapplied to each concept individually by applying the MIFS method on the inputdata set. After that, grouping the selected features has been introduced throughfirst constructing a bipartite graph representation of the feature-concept asso-ciations, and then exploiting the most representative subsets of features as thedomains by finding the best bicliques in such graph.

2) The concept representation was described in an instance-based manner.To form the concepts within the specified domains, the approach has calculatedthe convex regions of concepts and the salient weights of concept in relation tothe quality dimensions of the domains. Note that this calculation was entirelyformulated based on the associated observations to the concepts, without in-volving external knowledge.

A key finding in this contribution is that the proposed approach to constructconceptual spaces provides a generalisation for concept representation, wherethis representation can be derived from different types of input instances.

10.1.2 Semantic Inference in Conceptual Spaces (C2)

This thesis has introduced a semantic inference process to linguistically rep-resent a new observation within the built conceptual space (Chapter 4). Thefollowing contributions have addressed the task of semantic inferring in a con-ceptual space:

1) A symbol space was introduced as the complementary space to the con-ceptual space, which includes the semantics of the corresponding concepts andquality dimensions. This space enables the approach to determine the relevantsymbolic terms to represent a new observation.

2) The inference process to generate linguistic descriptions for an unknownobservation was presented in two phases: To determine associated concepts andquality dimensions of an unknown instance, its location within the constructedconceptual space has been investigated. This determination has been done byconsidering the inclusion of the instance in the regions of the space and the useof similarity measures in such space. After that, the lexicalisation of the instancehas been induced by extracting the semantic labels of the associated conceptsand quality dimensions. Finally, microplanning and realisation techniques havebeen applied in order to generate natural language descriptions.

One advantage of such inference model is that the proposed approach con-cerns which features and concepts should be inferred as the most suitable set ofinterpretable contents to describe an unknown observation.


1)Thespecificationofdomainsandqualitydimensionswaspresentedusingfeatureselectionandfeaturegroupingmethods,whichexploittherelevanceoftheselectedfeaturestotheknownconcepts.Theapproachhasappliedafeatureselectionmethodbasedonmutualinformationforfeatureselection(MIFS)toranktherelevanceoffeaturesandconcepts.ThisrankingalgorithmhasbeenappliedtoeachconceptindividuallybyapplyingtheMIFSmethodontheinputdataset.Afterthat,groupingtheselectedfeatureshasbeenintroducedthroughfirstconstructingabipartitegraphrepresentationofthefeature-conceptasso-ciations,andthenexploitingthemostrepresentativesubsetsoffeaturesasthedomainsbyfindingthebestbicliquesinsuchgraph.

2)Theconceptrepresentationwasdescribedinaninstance-basedmanner.Toformtheconceptswithinthespecifieddomains,theapproachhascalculatedtheconvexregionsofconceptsandthesalientweightsofconceptinrelationtothequalitydimensionsofthedomains.Notethatthiscalculationwasentirelyformulatedbasedontheassociatedobservationstotheconcepts,withoutin-volvingexternalknowledge.

Akeyfindinginthiscontributionisthattheproposedapproachtoconstructconceptualspacesprovidesageneralisationforconceptrepresentation,wherethisrepresentationcanbederivedfromdifferenttypesofinputinstances.

10.1.2SemanticInferenceinConceptualSpaces(C2)

Thisthesishasintroducedasemanticinferenceprocesstolinguisticallyrep-resentanewobservationwithinthebuiltconceptualspace(Chapter4).Thefollowingcontributionshaveaddressedthetaskofsemanticinferringinacon-ceptualspace:

1)Asymbolspacewasintroducedasthecomplementaryspacetothecon-ceptualspace,whichincludesthesemanticsofthecorrespondingconceptsandqualitydimensions.Thisspaceenablestheapproachtodeterminetherelevantsymbolictermstorepresentanewobservation.

2)Theinferenceprocesstogeneratelinguisticdescriptionsforanunknownobservationwaspresentedintwophases:Todetermineassociatedconceptsandqualitydimensionsofanunknowninstance,itslocationwithintheconstructedconceptualspacehasbeeninvestigated.Thisdeterminationhasbeendonebyconsideringtheinclusionoftheinstanceintheregionsofthespaceandtheuseofsimilaritymeasuresinsuchspace.Afterthat,thelexicalisationoftheinstancehasbeeninducedbyextractingthesemanticlabelsoftheassociatedconceptsandqualitydimensions.Finally,microplanningandrealisationtechniqueshavebeenappliedinordertogeneratenaturallanguagedescriptions.

Oneadvantageofsuchinferencemodelisthattheproposedapproachcon-cernswhichfeaturesandconceptsshouldbeinferredasthemostsuitablesetofinterpretablecontentstodescribeanunknownobservation.

10.1. SUMMARY OF CONTRIBUTIONS 155

10.1.3 Mining Prototypical Patterns and Temporal Rules in

Physiological Sensor Data (C3)

From the application point of view, this thesis has focused on the field of health-care monitoring to analyse the physiological sensor data. The data analysisphase has addressed the task of mining partial trends, prototypical patterns anddistinctive temporal rules using data-driven approaches. The following contri-butions have addressed the task of mining physiological sensor data:

1) A literature review on mining physiological sensor data was presented.This review has considered several health monitoring systems that use wearablesensors to monitor the vital signs. Different data mining tasks for healthcaresystems have been studied, as well as various machine learning techniques toaddress such tasks (Chapter 6).

2) This thesis has introduced an unsupervised approach to extract proto-typical patterns from a physiological time series data. This approach has beendesigned based on desensitising the time series and clustering the sub-sequencesof data to exploit the final patterns. Besides, a partial trend detection methodhas also been proposed to capture the partial behaviours of a time series con-cerning the shape and trend of the data (Chapter 7).

3) The central data analysis part of this work was the process of miningtemporal rules from various channels of sensor data in clinical conditions. Thisprocess has introduced a new temporal rule mining method to extract the re-peated co-occurrences of the physiological patterns in a set of long recordedsensor data. The significant output of such temporal rule mining was the factthat the extracted rules from the data of each clinical condition are distinguish-able from the temporal rules that are derived from other conditions. A newapproach proposed to compare temporal rules has confirmed this uniquenessof rules in each condition (Chapter 8).

The key outcome of the data analysis phase in this thesis is that the entireprocess relies on the measured data itself to express the valuable information.The proposed data-driven approaches have shown that there are some aspectsof the sensor data (i.e., unseen patterns and temporal rules) that are not knownor not readily observable by the domain experts, but they are still interesting tobe extracted and moreover are valuable to be interpreted.

10.1.4 Linguistic Description of Time Series Patterns using

Semantic Representations (C4)

After data analysis phase, this thesis considered the task of representing numer-ical information into linguistic descriptions. This representation has been doneusing both template-based approaches and the proposed semantic representa-tion based on data-driven conceptual spaces. The following contributions haveaddressed the task of describing physiological sensor data:

10.1.SUMMARYOFCONTRIBUTIONS155

10.1.3MiningPrototypicalPatternsandTemporalRulesin

PhysiologicalSensorData(C3)

Fromtheapplicationpointofview,thisthesishasfocusedonthefieldofhealth-caremonitoringtoanalysethephysiologicalsensordata.Thedataanalysisphasehasaddressedthetaskofminingpartialtrends,prototypicalpatternsanddistinctivetemporalrulesusingdata-drivenapproaches.Thefollowingcontri-butionshaveaddressedthetaskofminingphysiologicalsensordata:

1)Aliteraturereviewonminingphysiologicalsensordatawaspresented.Thisreviewhasconsideredseveralhealthmonitoringsystemsthatusewearablesensorstomonitorthevitalsigns.Differentdataminingtasksforhealthcaresystemshavebeenstudied,aswellasvariousmachinelearningtechniquestoaddresssuchtasks(Chapter6).

2)Thisthesishasintroducedanunsupervisedapproachtoextractproto-typicalpatternsfromaphysiologicaltimeseriesdata.Thisapproachhasbeendesignedbasedondesensitisingthetimeseriesandclusteringthesub-sequencesofdatatoexploitthefinalpatterns.Besides,apartialtrenddetectionmethodhasalsobeenproposedtocapturethepartialbehavioursofatimeseriescon-cerningtheshapeandtrendofthedata(Chapter7).

3)Thecentraldataanalysispartofthisworkwastheprocessofminingtemporalrulesfromvariouschannelsofsensordatainclinicalconditions.Thisprocesshasintroducedanewtemporalruleminingmethodtoextractthere-peatedco-occurrencesofthephysiologicalpatternsinasetoflongrecordedsensordata.Thesignificantoutputofsuchtemporalruleminingwasthefactthattheextractedrulesfromthedataofeachclinicalconditionaredistinguish-ablefromthetemporalrulesthatarederivedfromotherconditions.Anewapproachproposedtocomparetemporalruleshasconfirmedthisuniquenessofrulesineachcondition(Chapter8).

Thekeyoutcomeofthedataanalysisphaseinthisthesisisthattheentireprocessreliesonthemeasureddataitselftoexpressthevaluableinformation.Theproposeddata-drivenapproacheshaveshownthattherearesomeaspectsofthesensordata(i.e.,unseenpatternsandtemporalrules)thatarenotknownornotreadilyobservablebythedomainexperts,buttheyarestillinterestingtobeextractedandmoreoverarevaluabletobeinterpreted.

10.1.4LinguisticDescriptionofTimeSeriesPatternsusing

SemanticRepresentations(C4)

Afterdataanalysisphase,thisthesisconsideredthetaskofrepresentingnumer-icalinformationintolinguisticdescriptions.Thisrepresentationhasbeendoneusingbothtemplate-basedapproachesandtheproposedsemanticrepresenta-tionbasedondata-drivenconceptualspaces.Thefollowingcontributionshaveaddressedthetaskofdescribingphysiologicalsensordata:

10.1. SUMMARY OF CONTRIBUTIONS 155

10.1.3 Mining Prototypical Patterns and Temporal Rules in

Physiological Sensor Data (C3)

From the application point of view, this thesis has focused on the field of health-care monitoring to analyse the physiological sensor data. The data analysisphase has addressed the task of mining partial trends, prototypical patterns anddistinctive temporal rules using data-driven approaches. The following contri-butions have addressed the task of mining physiological sensor data:

1) A literature review on mining physiological sensor data was presented.This review has considered several health monitoring systems that use wearablesensors to monitor the vital signs. Different data mining tasks for healthcaresystems have been studied, as well as various machine learning techniques toaddress such tasks (Chapter 6).

2) This thesis has introduced an unsupervised approach to extract proto-typical patterns from a physiological time series data. This approach has beendesigned based on desensitising the time series and clustering the sub-sequencesof data to exploit the final patterns. Besides, a partial trend detection methodhas also been proposed to capture the partial behaviours of a time series con-cerning the shape and trend of the data (Chapter 7).

3) The central data analysis part of this work was the process of miningtemporal rules from various channels of sensor data in clinical conditions. Thisprocess has introduced a new temporal rule mining method to extract the re-peated co-occurrences of the physiological patterns in a set of long recordedsensor data. The significant output of such temporal rule mining was the factthat the extracted rules from the data of each clinical condition are distinguish-able from the temporal rules that are derived from other conditions. A newapproach proposed to compare temporal rules has confirmed this uniquenessof rules in each condition (Chapter 8).

The key outcome of the data analysis phase in this thesis is that the entireprocess relies on the measured data itself to express the valuable information.The proposed data-driven approaches have shown that there are some aspectsof the sensor data (i.e., unseen patterns and temporal rules) that are not knownor not readily observable by the domain experts, but they are still interesting tobe extracted and moreover are valuable to be interpreted.

10.1.4 Linguistic Description of Time Series Patterns using

Semantic Representations (C4)

After data analysis phase, this thesis considered the task of representing numer-ical information into linguistic descriptions. This representation has been doneusing both template-based approaches and the proposed semantic representa-tion based on data-driven conceptual spaces. The following contributions haveaddressed the task of describing physiological sensor data:

10.1.SUMMARYOFCONTRIBUTIONS155

10.1.3MiningPrototypicalPatternsandTemporalRulesin

PhysiologicalSensorData(C3)

Fromtheapplicationpointofview,thisthesishasfocusedonthefieldofhealth-caremonitoringtoanalysethephysiologicalsensordata.Thedataanalysisphasehasaddressedthetaskofminingpartialtrends,prototypicalpatternsanddistinctivetemporalrulesusingdata-drivenapproaches.Thefollowingcontri-butionshaveaddressedthetaskofminingphysiologicalsensordata:

1)Aliteraturereviewonminingphysiologicalsensordatawaspresented.Thisreviewhasconsideredseveralhealthmonitoringsystemsthatusewearablesensorstomonitorthevitalsigns.Differentdataminingtasksforhealthcaresystemshavebeenstudied,aswellasvariousmachinelearningtechniquestoaddresssuchtasks(Chapter6).

2)Thisthesishasintroducedanunsupervisedapproachtoextractproto-typicalpatternsfromaphysiologicaltimeseriesdata.Thisapproachhasbeendesignedbasedondesensitisingthetimeseriesandclusteringthesub-sequencesofdatatoexploitthefinalpatterns.Besides,apartialtrenddetectionmethodhasalsobeenproposedtocapturethepartialbehavioursofatimeseriescon-cerningtheshapeandtrendofthedata(Chapter7).

3)Thecentraldataanalysispartofthisworkwastheprocessofminingtemporalrulesfromvariouschannelsofsensordatainclinicalconditions.Thisprocesshasintroducedanewtemporalruleminingmethodtoextractthere-peatedco-occurrencesofthephysiologicalpatternsinasetoflongrecordedsensordata.Thesignificantoutputofsuchtemporalruleminingwasthefactthattheextractedrulesfromthedataofeachclinicalconditionaredistinguish-ablefromthetemporalrulesthatarederivedfromotherconditions.Anewapproachproposedtocomparetemporalruleshasconfirmedthisuniquenessofrulesineachcondition(Chapter8).

Thekeyoutcomeofthedataanalysisphaseinthisthesisisthattheentireprocessreliesonthemeasureddataitselftoexpressthevaluableinformation.Theproposeddata-drivenapproacheshaveshownthattherearesomeaspectsofthesensordata(i.e.,unseenpatternsandtemporalrules)thatarenotknownornotreadilyobservablebythedomainexperts,buttheyarestillinterestingtobeextractedandmoreoverarevaluabletobeinterpreted.

10.1.4LinguisticDescriptionofTimeSeriesPatternsusing

SemanticRepresentations(C4)

Afterdataanalysisphase,thisthesisconsideredthetaskofrepresentingnumer-icalinformationintolinguisticdescriptions.Thisrepresentationhasbeendoneusingbothtemplate-basedapproachesandtheproposedsemanticrepresenta-tionbasedondata-drivenconceptualspaces.Thefollowingcontributionshaveaddressedthetaskofdescribingphysiologicalsensordata:


1) A template-based linguistic description approach was presented to turnthe partial trends and patterns into a set of natural language terms. Also, forthe extracted temporal rules, this thesis has applied a fuzzy mapping methodto annotate the co-occurrence of the patterns and also the frequency of theirhappening using linguistic terms, and then it has generated natural languagesentences for the rules in each clinical condition (Chapter 8).

2) Applying the proposed semantic representation using data-driven con-ceptual spaces was one of the leading contributions for describing time seriespatterns. The proposed approach in the first part of this thesis has been appliedto a collection of processed known time series patterns as input numerical in-formation. Then, using the constructed conceptual space of time series patterns,the inference process has generated linguistic descriptions for a set of unknowntime series patterns. The presented empirical evaluation has shown the good-ness of the generated texts using the proposed semantic over the introducedgenerative and descriptive models (Chapter 9).

The advantage of applying the semantic model to the physiological data isto be able to interpret data-driven extracted patterns that are unknown by def-inition. Involving semantic representation helps the system to not be restrictedto analyse and mine only the pre-defined patterns requested by the domain ex-pert, but to search for any interesting information and be sure that the semanticmodel can generate a description for such information.

10.2 Limitations

The approaches proposed in both parts of this thesis have presented data-drivenstrategies (numerically, and semantically) for linking numerical information tolinguistic descriptions. The discussions given at the end of each chapter haveshown the key points and critical issues of the contributions. In this section, alist of more general issues and limitations related to the accomplished contri-butions are discussed.

Construction of conceptual spaces Regarding the proposed approach for con-struction of conceptual spaces, one limitation is the assumption of having se-mantic features as inputs. These features are assumed to be understandable orbe interpretable by the human, which means they can be used as semantic termsin the final linguistic descriptions. Besides, another assumption for the input ofconstructing conceptual spaces is the set of labelled or annotated observationswith known classes or concepts. With this assumption, the process of conceptforming can be categorised as a supervised modelling, since the labelled dataleads the process of the domain specification and concept representation.1

1Note that regardless of being supervised or not, still the approach is data-driven in a sense thatthere is no extra knowledge rather than the known input instances to influence the processes ofconstructing the space.


1)Atemplate-basedlinguisticdescriptionapproachwaspresentedtoturnthepartialtrendsandpatternsintoasetofnaturallanguageterms.Also,fortheextractedtemporalrules,thisthesishasappliedafuzzymappingmethodtoannotatetheco-occurrenceofthepatternsandalsothefrequencyoftheirhappeningusinglinguisticterms,andthenithasgeneratednaturallanguagesentencesfortherulesineachclinicalcondition(Chapter8).

2)Applyingtheproposedsemanticrepresentationusingdata-drivencon-ceptualspaceswasoneoftheleadingcontributionsfordescribingtimeseriespatterns.Theproposedapproachinthefirstpartofthisthesishasbeenappliedtoacollectionofprocessedknowntimeseriespatternsasinputnumericalin-formation.Then,usingtheconstructedconceptualspaceoftimeseriespatterns,theinferenceprocesshasgeneratedlinguisticdescriptionsforasetofunknowntimeseriespatterns.Thepresentedempiricalevaluationhasshownthegood-nessofthegeneratedtextsusingtheproposedsemanticovertheintroducedgenerativeanddescriptivemodels(Chapter9).

Theadvantageofapplyingthesemanticmodeltothephysiologicaldataistobeabletointerpretdata-drivenextractedpatternsthatareunknownbydef-inition.Involvingsemanticrepresentationhelpsthesystemtonotberestrictedtoanalyseandmineonlythepre-definedpatternsrequestedbythedomainex-pert,buttosearchforanyinterestinginformationandbesurethatthesemanticmodelcangenerateadescriptionforsuchinformation.

10.2Limitations

Theapproachesproposedinbothpartsofthisthesishavepresenteddata-drivenstrategies(numerically,andsemantically)forlinkingnumericalinformationtolinguisticdescriptions.Thediscussionsgivenattheendofeachchapterhaveshownthekeypointsandcriticalissuesofthecontributions.Inthissection,alistofmoregeneralissuesandlimitationsrelatedtotheaccomplishedcontri-butionsarediscussed.

ConstructionofconceptualspacesRegardingtheproposedapproachforcon-structionofconceptualspaces,onelimitationistheassumptionofhavingse-manticfeaturesasinputs.Thesefeaturesareassumedtobeunderstandableorbeinterpretablebythehuman,whichmeanstheycanbeusedassemantictermsinthefinallinguisticdescriptions.Besides,anotherassumptionfortheinputofconstructingconceptualspacesisthesetoflabelledorannotatedobservationswithknownclassesorconcepts.Withthisassumption,theprocessofconceptformingcanbecategorisedasasupervisedmodelling,sincethelabelleddataleadstheprocessofthedomainspecificationandconceptrepresentation.1

1Notethatregardlessofbeingsupervisedornot,stilltheapproachisdata-driveninasensethatthereisnoextraknowledgeratherthantheknowninputinstancestoinfluencetheprocessesofconstructingthespace.


1) A template-based linguistic description approach was presented to turnthe partial trends and patterns into a set of natural language terms. Also, forthe extracted temporal rules, this thesis has applied a fuzzy mapping methodto annotate the co-occurrence of the patterns and also the frequency of theirhappening using linguistic terms, and then it has generated natural languagesentences for the rules in each clinical condition (Chapter 8).

2) Applying the proposed semantic representation using data-driven con-ceptual spaces was one of the leading contributions for describing time seriespatterns. The proposed approach in the first part of this thesis has been appliedto a collection of processed known time series patterns as input numerical in-formation. Then, using the constructed conceptual space of time series patterns,the inference process has generated linguistic descriptions for a set of unknowntime series patterns. The presented empirical evaluation has shown the good-ness of the generated texts using the proposed semantic over the introducedgenerative and descriptive models (Chapter 9).

The advantage of applying the semantic model to the physiological data isto be able to interpret data-driven extracted patterns that are unknown by def-inition. Involving semantic representation helps the system to not be restrictedto analyse and mine only the pre-defined patterns requested by the domain ex-pert, but to search for any interesting information and be sure that the semanticmodel can generate a description for such information.

10.2 Limitations

The approaches proposed in both parts of this thesis have presented data-drivenstrategies (numerically, and semantically) for linking numerical information tolinguistic descriptions. The discussions given at the end of each chapter haveshown the key points and critical issues of the contributions. In this section, alist of more general issues and limitations related to the accomplished contri-butions are discussed.

Construction of conceptual spaces Regarding the proposed approach for con-struction of conceptual spaces, one limitation is the assumption of having se-mantic features as inputs. These features are assumed to be understandable orbe interpretable by the human, which means they can be used as semantic termsin the final linguistic descriptions. Besides, another assumption for the input ofconstructing conceptual spaces is the set of labelled or annotated observationswith known classes or concepts. With this assumption, the process of conceptforming can be categorised as a supervised modelling, since the labelled dataleads the process of the domain specification and concept representation.1

1Note that regardless of being supervised or not, still the approach is data-driven in a sense thatthere is no extra knowledge rather than the known input instances to influence the processes ofconstructing the space.


1)Atemplate-basedlinguisticdescriptionapproachwaspresentedtoturnthepartialtrendsandpatternsintoasetofnaturallanguageterms.Also,fortheextractedtemporalrules,thisthesishasappliedafuzzymappingmethodtoannotatetheco-occurrenceofthepatternsandalsothefrequencyoftheirhappeningusinglinguisticterms,andthenithasgeneratednaturallanguagesentencesfortherulesineachclinicalcondition(Chapter8).

2)Applyingtheproposedsemanticrepresentationusingdata-drivencon-ceptualspaceswasoneoftheleadingcontributionsfordescribingtimeseriespatterns.Theproposedapproachinthefirstpartofthisthesishasbeenappliedtoacollectionofprocessedknowntimeseriespatternsasinputnumericalin-formation.Then,usingtheconstructedconceptualspaceoftimeseriespatterns,theinferenceprocesshasgeneratedlinguisticdescriptionsforasetofunknowntimeseriespatterns.Thepresentedempiricalevaluationhasshownthegood-nessofthegeneratedtextsusingtheproposedsemanticovertheintroducedgenerativeanddescriptivemodels(Chapter9).

Theadvantageofapplyingthesemanticmodeltothephysiologicaldataistobeabletointerpretdata-drivenextractedpatternsthatareunknownbydef-inition.Involvingsemanticrepresentationhelpsthesystemtonotberestrictedtoanalyseandmineonlythepre-definedpatternsrequestedbythedomainex-pert,buttosearchforanyinterestinginformationandbesurethatthesemanticmodelcangenerateadescriptionforsuchinformation.

10.2Limitations

Theapproachesproposedinbothpartsofthisthesishavepresenteddata-drivenstrategies(numerically,andsemantically)forlinkingnumericalinformationtolinguisticdescriptions.Thediscussionsgivenattheendofeachchapterhaveshownthekeypointsandcriticalissuesofthecontributions.Inthissection,alistofmoregeneralissuesandlimitationsrelatedtotheaccomplishedcontri-butionsarediscussed.

ConstructionofconceptualspacesRegardingtheproposedapproachforcon-structionofconceptualspaces,onelimitationistheassumptionofhavingse-manticfeaturesasinputs.Thesefeaturesareassumedtobeunderstandableorbeinterpretablebythehuman,whichmeanstheycanbeusedassemantictermsinthefinallinguisticdescriptions.Besides,anotherassumptionfortheinputofconstructingconceptualspacesisthesetoflabelledorannotatedobservationswithknownclassesorconcepts.Withthisassumption,theprocessofconceptformingcanbecategorisedasasupervisedmodelling,sincethelabelleddataleadstheprocessofthedomainspecificationandconceptrepresentation.1

1Notethatregardlessofbeingsupervisedornot,stilltheapproachisdata-driveninasensethatthereisnoextraknowledgeratherthantheknowninputinstancestoinfluencetheprocessesofconstructingthespace.

10.2. LIMITATIONS 157

Another limitation of constructing conceptual spaces is related to the pro-posed algorithms. The algorithms for the domain and quality dimension spec-ification have been introduced heuristic approaches to filter and to group thefeatures, which are not the optimal ways to do so. Therefore, there is a lackof studying on how to measure the goodness of the specified domains in com-parison with other non-greedy approaches. It is worth mentioning that thesealgorithms have not aimed to perform classification task to discriminate theclasses of data with high accuracy. Instead, they have attempted to select themost distinctive features for each class of data to enrich the descriptivity of themodel by presenting multi-domain space. Therefore, it might be meaninglessto apply classification measures like recall or precision in order to measure thegoodness of the model.

Semantic inference in conceptual spaces: Regarding the semantic inference inconceptual spaces, the structure of the introduced symbol space is dependenton domain knowledge, which is the set of linguistic annotations and symbolsof the input features and concepts. An important point to mention is that thereis no inference to exploit or generate a new semantic label or term by reasoningamong the relation of the provided symbols or words based on any knowledge-based system (e.g., ontologies). Instead, the inference process chooses the mostrepresentative terms and labels for a new unknown observation among all thealready provided information.

Another constraint in the inference process is related to the unknown obser-vations as the inputs. The unknown observations should have the same proper-ties that the known observations have. More precisely, the unknown observa-tions should be able to be vectorised within the constructed conceptual space.Thus, any unknown observation with missing data or out-of-range values willbe problematic in the current version of the proposed inference approach.

The linguistic description task in the inference process has also some limita-tions. The main one is that the semantic role of the concepts, sub-concepts, andquality dimensions are reduced or simplified to some specific linguistic roles in asentence. The concepts are restricted to be the nouns and sub-concepts or prop-erties are constrained to be the adjectives in the final realisation task. Regardingthe quality of the inferred descriptions, one limitation is that the generated textis subject to questions related to some criteria such as conciseness versus ver-bosity of the descriptions, the target audience of the system, and the ultimateaim the text is expected to support.

Case study and evaluation: Regarding the evaluation of the semantic represen-tation approach, it is not trivial to find a general solution to evaluate each com-ponent or step of the approach, separately. For the construction part, there isnot yet a measure to compare the constructed conceptual spaces in term of ap-plicability or sufficiency. Moreover, the inference process within the conceptualspace cannot be isolated and be evaluated individually. Therefore, this thesis

10.2.LIMITATIONS157

Anotherlimitationofconstructingconceptualspacesisrelatedtothepro-posedalgorithms.Thealgorithmsforthedomainandqualitydimensionspec-ificationhavebeenintroducedheuristicapproachestofilterandtogroupthefeatures,whicharenottheoptimalwaystodoso.Therefore,thereisalackofstudyingonhowtomeasurethegoodnessofthespecifieddomainsincom-parisonwithothernon-greedyapproaches.Itisworthmentioningthatthesealgorithmshavenotaimedtoperformclassificationtasktodiscriminatetheclassesofdatawithhighaccuracy.Instead,theyhaveattemptedtoselectthemostdistinctivefeaturesforeachclassofdatatoenrichthedescriptivityofthemodelbypresentingmulti-domainspace.Therefore,itmightbemeaninglesstoapplyclassificationmeasureslikerecallorprecisioninordertomeasurethegoodnessofthemodel.

Semanticinferenceinconceptualspaces:Regardingthesemanticinferenceinconceptualspaces,thestructureoftheintroducedsymbolspaceisdependentondomainknowledge,whichisthesetoflinguisticannotationsandsymbolsoftheinputfeaturesandconcepts.Animportantpointtomentionisthatthereisnoinferencetoexploitorgenerateanewsemanticlabelortermbyreasoningamongtherelationoftheprovidedsymbolsorwordsbasedonanyknowledge-basedsystem(e.g.,ontologies).Instead,theinferenceprocesschoosesthemostrepresentativetermsandlabelsforanewunknownobservationamongallthealreadyprovidedinformation.

Anotherconstraintintheinferenceprocessisrelatedtotheunknownobser-vationsastheinputs.Theunknownobservationsshouldhavethesameproper-tiesthattheknownobservationshave.Moreprecisely,theunknownobserva-tionsshouldbeabletobevectorisedwithintheconstructedconceptualspace.Thus,anyunknownobservationwithmissingdataorout-of-rangevalueswillbeproblematicinthecurrentversionoftheproposedinferenceapproach.

Thelinguisticdescriptiontaskintheinferenceprocesshasalsosomelimita-tions.Themainoneisthatthesemanticroleoftheconcepts,sub-concepts,andqualitydimensionsarereducedorsimplifiedtosomespecificlinguisticrolesinasentence.Theconceptsarerestrictedtobethenounsandsub-conceptsorprop-ertiesareconstrainedtobetheadjectivesinthefinalrealisationtask.Regardingthequalityoftheinferreddescriptions,onelimitationisthatthegeneratedtextissubjecttoquestionsrelatedtosomecriteriasuchasconcisenessversusver-bosityofthedescriptions,thetargetaudienceofthesystem,andtheultimateaimthetextisexpectedtosupport.

Casestudyandevaluation:Regardingtheevaluationofthesemanticrepresen-tationapproach,itisnottrivialtofindageneralsolutiontoevaluateeachcom-ponentorstepoftheapproach,separately.Fortheconstructionpart,thereisnotyetameasuretocomparetheconstructedconceptualspacesintermofap-plicabilityorsufficiency.Moreover,theinferenceprocesswithintheconceptualspacecannotbeisolatedandbeevaluatedindividually.Therefore,thisthesis

10.2. LIMITATIONS 157

Another limitation of constructing conceptual spaces is related to the pro-posed algorithms. The algorithms for the domain and quality dimension spec-ification have been introduced heuristic approaches to filter and to group thefeatures, which are not the optimal ways to do so. Therefore, there is a lackof studying on how to measure the goodness of the specified domains in com-parison with other non-greedy approaches. It is worth mentioning that thesealgorithms have not aimed to perform classification task to discriminate theclasses of data with high accuracy. Instead, they have attempted to select themost distinctive features for each class of data to enrich the descriptivity of themodel by presenting multi-domain space. Therefore, it might be meaninglessto apply classification measures like recall or precision in order to measure thegoodness of the model.

Semantic inference in conceptual spaces: Regarding the semantic inference inconceptual spaces, the structure of the introduced symbol space is dependenton domain knowledge, which is the set of linguistic annotations and symbolsof the input features and concepts. An important point to mention is that thereis no inference to exploit or generate a new semantic label or term by reasoningamong the relation of the provided symbols or words based on any knowledge-based system (e.g., ontologies). Instead, the inference process chooses the mostrepresentative terms and labels for a new unknown observation among all thealready provided information.

Another constraint in the inference process is related to the unknown obser-vations as the inputs. The unknown observations should have the same proper-ties that the known observations have. More precisely, the unknown observa-tions should be able to be vectorised within the constructed conceptual space.Thus, any unknown observation with missing data or out-of-range values willbe problematic in the current version of the proposed inference approach.

The linguistic description task in the inference process has also some limita-tions. The main one is that the semantic role of the concepts, sub-concepts, andquality dimensions are reduced or simplified to some specific linguistic roles in asentence. The concepts are restricted to be the nouns and sub-concepts or prop-erties are constrained to be the adjectives in the final realisation task. Regardingthe quality of the inferred descriptions, one limitation is that the generated textis subject to questions related to some criteria such as conciseness versus ver-bosity of the descriptions, the target audience of the system, and the ultimateaim the text is expected to support.

Case study and evaluation: Regarding the evaluation of the semantic represen-tation approach, it is not trivial to find a general solution to evaluate each com-ponent or step of the approach, separately. For the construction part, there isnot yet a measure to compare the constructed conceptual spaces in term of ap-plicability or sufficiency. Moreover, the inference process within the conceptualspace cannot be isolated and be evaluated individually. Therefore, this thesis

10.2.LIMITATIONS157

Anotherlimitationofconstructingconceptualspacesisrelatedtothepro-posedalgorithms.Thealgorithmsforthedomainandqualitydimensionspec-ificationhavebeenintroducedheuristicapproachestofilterandtogroupthefeatures,whicharenottheoptimalwaystodoso.Therefore,thereisalackofstudyingonhowtomeasurethegoodnessofthespecifieddomainsincom-parisonwithothernon-greedyapproaches.Itisworthmentioningthatthesealgorithmshavenotaimedtoperformclassificationtasktodiscriminatetheclassesofdatawithhighaccuracy.Instead,theyhaveattemptedtoselectthemostdistinctivefeaturesforeachclassofdatatoenrichthedescriptivityofthemodelbypresentingmulti-domainspace.Therefore,itmightbemeaninglesstoapplyclassificationmeasureslikerecallorprecisioninordertomeasurethegoodnessofthemodel.

Semanticinferenceinconceptualspaces:Regardingthesemanticinferenceinconceptualspaces,thestructureoftheintroducedsymbolspaceisdependentondomainknowledge,whichisthesetoflinguisticannotationsandsymbolsoftheinputfeaturesandconcepts.Animportantpointtomentionisthatthereisnoinferencetoexploitorgenerateanewsemanticlabelortermbyreasoningamongtherelationoftheprovidedsymbolsorwordsbasedonanyknowledge-basedsystem(e.g.,ontologies).Instead,theinferenceprocesschoosesthemostrepresentativetermsandlabelsforanewunknownobservationamongallthealreadyprovidedinformation.

Anotherconstraintintheinferenceprocessisrelatedtotheunknownobser-vationsastheinputs.Theunknownobservationsshouldhavethesameproper-tiesthattheknownobservationshave.Moreprecisely,theunknownobserva-tionsshouldbeabletobevectorisedwithintheconstructedconceptualspace.Thus,anyunknownobservationwithmissingdataorout-of-rangevalueswillbeproblematicinthecurrentversionoftheproposedinferenceapproach.

Thelinguisticdescriptiontaskintheinferenceprocesshasalsosomelimita-tions.Themainoneisthatthesemanticroleoftheconcepts,sub-concepts,andqualitydimensionsarereducedorsimplifiedtosomespecificlinguisticrolesinasentence.Theconceptsarerestrictedtobethenounsandsub-conceptsorprop-ertiesareconstrainedtobetheadjectivesinthefinalrealisationtask.Regardingthequalityoftheinferreddescriptions,onelimitationisthatthegeneratedtextissubjecttoquestionsrelatedtosomecriteriasuchasconcisenessversusver-bosityofthedescriptions,thetargetaudienceofthesystem,andtheultimateaimthetextisexpectedtosupport.

Casestudyandevaluation:Regardingtheevaluationofthesemanticrepresen-tationapproach,itisnottrivialtofindageneralsolutiontoevaluateeachcom-ponentorstepoftheapproach,separately.Fortheconstructionpart,thereisnotyetameasuretocomparetheconstructedconceptualspacesintermofap-plicabilityorsufficiency.Moreover,theinferenceprocesswithintheconceptualspacecannotbeisolatedandbeevaluatedindividually.Therefore,thisthesis


has assessed the final output of the inference step in order to evaluate the good-ness of multi-domain model in comparison with the single-domain models. Thiscomparison accesses the idea of constructing multi-domain conceptual spacesout of numerical data, but not the individual components of the constructionprocess.

Mining physiological sensor data: There are some limitations related to dataanalysis of physiological data. First of all, the source of data itself is a chal-lenge. There are few available data sets including the vital signs recorded fora long time from different subjects. This limitation has made the constraint toonly test the approach on open access data sets, rather than collecting data viawearable sensors (e.g., within a controlled environment). The main reason torely on long-term data is that the proposed algorithms for pattern abstractionand temporal rule mining will return meaningful information when the input isa long stream of the sensor data with minimum interruptions or noises.

Regarding the temporal rule mining, the presented approach has consid-ered the frequency of the co-occurrence of the patterns throughout the differentchannels of data. The current approach has only provided rules which are tem-poral correlations of the patterns from the co-occurrence aspects. Thus, there isa lack of considering other ways of analysing data to capture more meaningfulinformation, such as the causes and effects of the patterns.

Linguistic description of physiological patterns: The main limitation of the pro-posed approach for describing time series patterns is the richness of the pro-vided features. In general. there is a limited number of semantic features, repre-senting the behaviour of the time series patterns, in the literature. In addition,most of the features to analyse the time series are complicated to be interpretedor are not relevant to the context. For example, wavelet coefficients are theuseful features for numerical time series analysis, but they are complex to betranslated into understandable natural language terms. Some features like thearea under the signal might be explainable but are not interpretable or mean-ingful when e.g., the shape and the trend of the time series are in the focus. So,the proposed conceptual space of the patterns is limited to a small number ofinput features which are more or less understandable by the human subjects toshow the behaviour of the time series.

10.3 Societal and Ethical Impacts

Nowadays sensors are everywhere to collect various kinds of information. Un-derstanding and interpreting this information is a significant challenge, espe-cially when the information is related to crucial aspects of people’s life. Thisthesis has provided a system to process and describe unknown observations de-rived from sensor data. Within a society, this research might be applicable indifferent scenarios. In any situation that the perceived information is not explic-


hasassessedthefinaloutputoftheinferencestepinordertoevaluatethegood-nessofmulti-domainmodelincomparisonwiththesingle-domainmodels.Thiscomparisonaccessestheideaofconstructingmulti-domainconceptualspacesoutofnumericaldata,butnottheindividualcomponentsoftheconstructionprocess.

Miningphysiologicalsensordata:Therearesomelimitationsrelatedtodataanalysisofphysiologicaldata.Firstofall,thesourceofdataitselfisachal-lenge.Therearefewavailabledatasetsincludingthevitalsignsrecordedforalongtimefromdifferentsubjects.Thislimitationhasmadetheconstrainttoonlytesttheapproachonopenaccessdatasets,ratherthancollectingdataviawearablesensors(e.g.,withinacontrolledenvironment).Themainreasontorelyonlong-termdataisthattheproposedalgorithmsforpatternabstractionandtemporalruleminingwillreturnmeaningfulinformationwhentheinputisalongstreamofthesensordatawithminimuminterruptionsornoises.

Regardingthetemporalrulemining,thepresentedapproachhasconsid-eredthefrequencyoftheco-occurrenceofthepatternsthroughoutthedifferentchannelsofdata.Thecurrentapproachhasonlyprovidedruleswhicharetem-poralcorrelationsofthepatternsfromtheco-occurrenceaspects.Thus,thereisalackofconsideringotherwaysofanalysingdatatocapturemoremeaningfulinformation,suchasthecausesandeffectsofthepatterns.

Linguisticdescriptionofphysiologicalpatterns:Themainlimitationofthepro-posedapproachfordescribingtimeseriespatternsistherichnessofthepro-videdfeatures.Ingeneral.thereisalimitednumberofsemanticfeatures,repre-sentingthebehaviourofthetimeseriespatterns,intheliterature.Inaddition,mostofthefeaturestoanalysethetimeseriesarecomplicatedtobeinterpretedorarenotrelevanttothecontext.Forexample,waveletcoefficientsaretheusefulfeaturesfornumericaltimeseriesanalysis,buttheyarecomplextobetranslatedintounderstandablenaturallanguageterms.Somefeaturesliketheareaunderthesignalmightbeexplainablebutarenotinterpretableormean-ingfulwhene.g.,theshapeandthetrendofthetimeseriesareinthefocus.So,theproposedconceptualspaceofthepatternsislimitedtoasmallnumberofinputfeatureswhicharemoreorlessunderstandablebythehumansubjectstoshowthebehaviourofthetimeseries.

10.3SocietalandEthicalImpacts

Nowadayssensorsareeverywheretocollectvariouskindsofinformation.Un-derstandingandinterpretingthisinformationisasignificantchallenge,espe-ciallywhentheinformationisrelatedtocrucialaspectsofpeople’slife.Thisthesishasprovidedasystemtoprocessanddescribeunknownobservationsde-rivedfromsensordata.Withinasociety,thisresearchmightbeapplicableindifferentscenarios.Inanysituationthattheperceivedinformationisnotexplic-


has assessed the final output of the inference step in order to evaluate the good-ness of multi-domain model in comparison with the single-domain models. Thiscomparison accesses the idea of constructing multi-domain conceptual spacesout of numerical data, but not the individual components of the constructionprocess.

Mining physiological sensor data: There are some limitations related to dataanalysis of physiological data. First of all, the source of data itself is a chal-lenge. There are few available data sets including the vital signs recorded fora long time from different subjects. This limitation has made the constraint toonly test the approach on open access data sets, rather than collecting data viawearable sensors (e.g., within a controlled environment). The main reason torely on long-term data is that the proposed algorithms for pattern abstractionand temporal rule mining will return meaningful information when the input isa long stream of the sensor data with minimum interruptions or noises.

Regarding the temporal rule mining, the presented approach has consid-ered the frequency of the co-occurrence of the patterns throughout the differentchannels of data. The current approach has only provided rules which are tem-poral correlations of the patterns from the co-occurrence aspects. Thus, there isa lack of considering other ways of analysing data to capture more meaningfulinformation, such as the causes and effects of the patterns.

Linguistic description of physiological patterns: The main limitation of the pro-posed approach for describing time series patterns is the richness of the pro-vided features. In general. there is a limited number of semantic features, repre-senting the behaviour of the time series patterns, in the literature. In addition,most of the features to analyse the time series are complicated to be interpretedor are not relevant to the context. For example, wavelet coefficients are theuseful features for numerical time series analysis, but they are complex to betranslated into understandable natural language terms. Some features like thearea under the signal might be explainable but are not interpretable or mean-ingful when e.g., the shape and the trend of the time series are in the focus. So,the proposed conceptual space of the patterns is limited to a small number ofinput features which are more or less understandable by the human subjects toshow the behaviour of the time series.

10.3 Societal and Ethical Impacts

Nowadays sensors are everywhere to collect various kinds of information. Un-derstanding and interpreting this information is a significant challenge, espe-cially when the information is related to crucial aspects of people’s life. Thisthesis has provided a system to process and describe unknown observations de-rived from sensor data. Within a society, this research might be applicable indifferent scenarios. In any situation that the perceived information is not explic-


hasassessedthefinaloutputoftheinferencestepinordertoevaluatethegood-nessofmulti-domainmodelincomparisonwiththesingle-domainmodels.Thiscomparisonaccessestheideaofconstructingmulti-domainconceptualspacesoutofnumericaldata,butnottheindividualcomponentsoftheconstructionprocess.

Miningphysiologicalsensordata:Therearesomelimitationsrelatedtodataanalysisofphysiologicaldata.Firstofall,thesourceofdataitselfisachal-lenge.Therearefewavailabledatasetsincludingthevitalsignsrecordedforalongtimefromdifferentsubjects.Thislimitationhasmadetheconstrainttoonlytesttheapproachonopenaccessdatasets,ratherthancollectingdataviawearablesensors(e.g.,withinacontrolledenvironment).Themainreasontorelyonlong-termdataisthattheproposedalgorithmsforpatternabstractionandtemporalruleminingwillreturnmeaningfulinformationwhentheinputisalongstreamofthesensordatawithminimuminterruptionsornoises.

Regardingthetemporalrulemining,thepresentedapproachhasconsid-eredthefrequencyoftheco-occurrenceofthepatternsthroughoutthedifferentchannelsofdata.Thecurrentapproachhasonlyprovidedruleswhicharetem-poralcorrelationsofthepatternsfromtheco-occurrenceaspects.Thus,thereisalackofconsideringotherwaysofanalysingdatatocapturemoremeaningfulinformation,suchasthecausesandeffectsofthepatterns.

Linguisticdescriptionofphysiologicalpatterns:Themainlimitationofthepro-posedapproachfordescribingtimeseriespatternsistherichnessofthepro-videdfeatures.Ingeneral.thereisalimitednumberofsemanticfeatures,repre-sentingthebehaviourofthetimeseriespatterns,intheliterature.Inaddition,mostofthefeaturestoanalysethetimeseriesarecomplicatedtobeinterpretedorarenotrelevanttothecontext.Forexample,waveletcoefficientsaretheusefulfeaturesfornumericaltimeseriesanalysis,buttheyarecomplextobetranslatedintounderstandablenaturallanguageterms.Somefeaturesliketheareaunderthesignalmightbeexplainablebutarenotinterpretableormean-ingfulwhene.g.,theshapeandthetrendofthetimeseriesareinthefocus.So,theproposedconceptualspaceofthepatternsislimitedtoasmallnumberofinputfeatureswhicharemoreorlessunderstandablebythehumansubjectstoshowthebehaviourofthetimeseries.

10.3SocietalandEthicalImpacts

Nowadayssensorsareeverywheretocollectvariouskindsofinformation.Un-derstandingandinterpretingthisinformationisasignificantchallenge,espe-ciallywhentheinformationisrelatedtocrucialaspectsofpeople’slife.Thisthesishasprovidedasystemtoprocessanddescribeunknownobservationsde-rivedfromsensordata.Withinasociety,thisresearchmightbeapplicableindifferentscenarios.Inanysituationthattheperceivedinformationisnotexplic-

10.4. FUTURE RESEARCH DIRECTIONS 159

itly recognisable, the proposed system can help to interpret those observations.An example of the usage of the system in society could be to help particu-lar groups of people (e.g., children, persons with disabilities) in order to makesense about the observations around them. This goal can be achieved by cou-pling the approach with other perception or/and decision making systems. Theimpact of the proposed approach will be more critical in the medical domainsince it can help the patients of clinicians to see new aspects of the perceivedinformation that have been unknown beforehand.

Besides the societal impacts of this research, several ethical impacts must beconsidered. The ethical issues in this thesis can be seen from three perspectives:input data, proposed approach, and output text. Regarding the input data, theprimary ethical challenge (especially in the medical domain) is the anonymityof the subjects who use the sensors. Systems that use the recorded data fromusers should ensure to protect the privacy of them under regulations such asthe General Data Protection Regulation (GDPR) [2]. In this thesis, all the ac-quired physiological sensor data have been kept anonymous. The ethical issuerelated to the approach is the problem of blindly applying data-driven strate-gies. A potential risk within the data-driven approaches is that the constructedmodels ((e.g., learning or decision making models) can be easily biased if theinput observations are biased. Thus, one might think about which observationsare suitable to be fed to such blind models. Last but not least, ethical issuesare related to the output linguistic descriptions. A generated text for a specificend-user might involve ambiguous, inadequate content, or even might misplacethe provided contents in the sentence. These problems can lead to critical issuessuch as misinterpreting the content of the text, and consequently making inac-curate or incorrect decisions. This thesis has not directly addressed this ethicalissue in its case studies, but it is worth to keep it in mind for further develop-ments upon such a framework.

10.4 Future Research Directions

Besides the limitations of the proposed approaches mentioned above, there isa number of future research directions that require further investigation. Thissection presents the future work regarding each part of the thesis.

Conceptual Spaces: Construction and Inference

One possible direction of the future work is to focus on acquiring semanticfeatures that are contextually understandable by the human and are able todifferentiate distinct concepts in the domain [77, 96, 217]. There has been apreliminary ongoing study on this topic by the author of this thesis, whichneeds to be extended and be evaluated. Two possible approaches can be ap-plied for the task of semantic feature acquisition. One way is to specify theattributes coming from the human perceptions of instances that bridges be-

10.4.FUTURERESEARCHDIRECTIONS159

itlyrecognisable,theproposedsystemcanhelptointerpretthoseobservations.Anexampleoftheusageofthesysteminsocietycouldbetohelpparticu-largroupsofpeople(e.g.,children,personswithdisabilities)inordertomakesenseabouttheobservationsaroundthem.Thisgoalcanbeachievedbycou-plingtheapproachwithotherperceptionor/anddecisionmakingsystems.Theimpactoftheproposedapproachwillbemorecriticalinthemedicaldomainsinceitcanhelpthepatientsofclinicianstoseenewaspectsoftheperceivedinformationthathavebeenunknownbeforehand.

Besidesthesocietalimpactsofthisresearch,severalethicalimpactsmustbeconsidered.Theethicalissuesinthisthesiscanbeseenfromthreeperspectives:inputdata,proposedapproach,andoutputtext.Regardingtheinputdata,theprimaryethicalchallenge(especiallyinthemedicaldomain)istheanonymityofthesubjectswhousethesensors.SystemsthatusetherecordeddatafromusersshouldensuretoprotecttheprivacyofthemunderregulationssuchastheGeneralDataProtectionRegulation(GDPR)[2].Inthisthesis,alltheac-quiredphysiologicalsensordatahavebeenkeptanonymous.Theethicalissuerelatedtotheapproachistheproblemofblindlyapplyingdata-drivenstrate-gies.Apotentialriskwithinthedata-drivenapproachesisthattheconstructedmodels((e.g.,learningordecisionmakingmodels)canbeeasilybiasediftheinputobservationsarebiased.Thus,onemightthinkaboutwhichobservationsaresuitabletobefedtosuchblindmodels.Lastbutnotleast,ethicalissuesarerelatedtotheoutputlinguisticdescriptions.Ageneratedtextforaspecificend-usermightinvolveambiguous,inadequatecontent,orevenmightmisplacetheprovidedcontentsinthesentence.Theseproblemscanleadtocriticalissuessuchasmisinterpretingthecontentofthetext,andconsequentlymakinginac-curateorincorrectdecisions.Thisthesishasnotdirectlyaddressedthisethicalissueinitscasestudies,butitisworthtokeepitinmindforfurtherdevelop-mentsuponsuchaframework.

10.4FutureResearchDirections

Besidesthelimitationsoftheproposedapproachesmentionedabove,thereisanumberoffutureresearchdirectionsthatrequirefurtherinvestigation.Thissectionpresentsthefutureworkregardingeachpartofthethesis.

ConceptualSpaces:ConstructionandInference

Onepossibledirectionofthefutureworkistofocusonacquiringsemanticfeaturesthatarecontextuallyunderstandablebythehumanandareabletodifferentiatedistinctconceptsinthedomain[77,96,217].Therehasbeenapreliminaryongoingstudyonthistopicbytheauthorofthisthesis,whichneedstobeextendedandbeevaluated.Twopossibleapproachescanbeap-pliedforthetaskofsemanticfeatureacquisition.Onewayistospecifytheattributescomingfromthehumanperceptionsofinstancesthatbridgesbe-


itly recognisable, the proposed system can help to interpret those observations.An example of the usage of the system in society could be to help particu-lar groups of people (e.g., children, persons with disabilities) in order to makesense about the observations around them. This goal can be achieved by cou-pling the approach with other perception or/and decision making systems. Theimpact of the proposed approach will be more critical in the medical domainsince it can help the patients of clinicians to see new aspects of the perceivedinformation that have been unknown beforehand.

Besides the societal impacts of this research, several ethical impacts must beconsidered. The ethical issues in this thesis can be seen from three perspectives:input data, proposed approach, and output text. Regarding the input data, theprimary ethical challenge (especially in the medical domain) is the anonymityof the subjects who use the sensors. Systems that use the recorded data fromusers should ensure to protect the privacy of them under regulations such asthe General Data Protection Regulation (GDPR) [2]. In this thesis, all the ac-quired physiological sensor data have been kept anonymous. The ethical issuerelated to the approach is the problem of blindly applying data-driven strate-gies. A potential risk within the data-driven approaches is that the constructedmodels ((e.g., learning or decision making models) can be easily biased if theinput observations are biased. Thus, one might think about which observationsare suitable to be fed to such blind models. Last but not least, ethical issuesare related to the output linguistic descriptions. A generated text for a specificend-user might involve ambiguous, inadequate content, or even might misplacethe provided contents in the sentence. These problems can lead to critical issuessuch as misinterpreting the content of the text, and consequently making inac-curate or incorrect decisions. This thesis has not directly addressed this ethicalissue in its case studies, but it is worth to keep it in mind for further develop-ments upon such a framework.

10.4 Future Research Directions

Besides the limitations of the proposed approaches mentioned above, there isa number of future research directions that require further investigation. Thissection presents the future work regarding each part of the thesis.

Conceptual Spaces: Construction and Inference

One possible direction of the future work is to focus on acquiring semanticfeatures that are contextually understandable by the human and are able todifferentiate distinct concepts in the domain [77, 96, 217]. There has been apreliminary ongoing study on this topic by the author of this thesis, whichneeds to be extended and be evaluated. Two possible approaches can be ap-plied for the task of semantic feature acquisition. One way is to specify theattributes coming from the human perceptions of instances that bridges be-


itlyrecognisable,theproposedsystemcanhelptointerpretthoseobservations.Anexampleoftheusageofthesysteminsocietycouldbetohelpparticu-largroupsofpeople(e.g.,children,personswithdisabilities)inordertomakesenseabouttheobservationsaroundthem.Thisgoalcanbeachievedbycou-plingtheapproachwithotherperceptionor/anddecisionmakingsystems.Theimpactoftheproposedapproachwillbemorecriticalinthemedicaldomainsinceitcanhelpthepatientsofclinicianstoseenewaspectsoftheperceivedinformationthathavebeenunknownbeforehand.

Besidesthesocietalimpactsofthisresearch,severalethicalimpactsmustbeconsidered.Theethicalissuesinthisthesiscanbeseenfromthreeperspectives:inputdata,proposedapproach,andoutputtext.Regardingtheinputdata,theprimaryethicalchallenge(especiallyinthemedicaldomain)istheanonymityofthesubjectswhousethesensors.SystemsthatusetherecordeddatafromusersshouldensuretoprotecttheprivacyofthemunderregulationssuchastheGeneralDataProtectionRegulation(GDPR)[2].Inthisthesis,alltheac-quiredphysiologicalsensordatahavebeenkeptanonymous.Theethicalissuerelatedtotheapproachistheproblemofblindlyapplyingdata-drivenstrate-gies.Apotentialriskwithinthedata-drivenapproachesisthattheconstructedmodels((e.g.,learningordecisionmakingmodels)canbeeasilybiasediftheinputobservationsarebiased.Thus,onemightthinkaboutwhichobservationsaresuitabletobefedtosuchblindmodels.Lastbutnotleast,ethicalissuesarerelatedtotheoutputlinguisticdescriptions.Ageneratedtextforaspecificend-usermightinvolveambiguous,inadequatecontent,orevenmightmisplacetheprovidedcontentsinthesentence.Theseproblemscanleadtocriticalissuessuchasmisinterpretingthecontentofthetext,andconsequentlymakinginac-curateorincorrectdecisions.Thisthesishasnotdirectlyaddressedthisethicalissueinitscasestudies,butitisworthtokeepitinmindforfurtherdevelop-mentsuponsuchaframework.

10.4FutureResearchDirections

Besidesthelimitationsoftheproposedapproachesmentionedabove,thereisanumberoffutureresearchdirectionsthatrequirefurtherinvestigation.Thissectionpresentsthefutureworkregardingeachpartofthethesis.

ConceptualSpaces:ConstructionandInference

Onepossibledirectionofthefutureworkistofocusonacquiringsemanticfeaturesthatarecontextuallyunderstandablebythehumanandareabletodifferentiatedistinctconceptsinthedomain[77,96,217].Therehasbeenapreliminaryongoingstudyonthistopicbytheauthorofthisthesis,whichneedstobeextendedandbeevaluated.Twopossibleapproachescanbeap-pliedforthetaskofsemanticfeatureacquisition.Onewayistospecifytheattributescomingfromthehumanperceptionsofinstancesthatbridgesbe-


Figure 10.1: Human-specified semantic features for animal domain.

tween low-level features and linguistic words [204]. Directly asking the expertsor users to specify the attributes is an obvious way to provide both understand-able and discriminative features. As an example one can differ between differ-ent animals by specifying the semantic features like hair type, domestic/wild, orhaving horns (See Figure 10.1 for some examples of human-specified semanticfeatures). Another way is to verify the attributes that are scientifically measur-able and potentially interpretable in natural language but are not obvious tospecify by humans in the first glance. Such examples of these features for an-imal domain are agility or hibernation which are discriminative features forcategorising animal species [135]. These ways of acquiring semantic featuresthen guarantee that the input features to the construction of conceptual spacesare human explainable information that can be later used for inferring linguisticdescriptions of unknown observations as well.

Construction of conceptual spaces has the advantage of using labelled inputobservations to form the concepts. Another way to look at the problem ofturning numerical data into the conceptual spaces is to deal with unlabelleddata sets. In this case, there will be no pre-defined classes or concepts for thedata. An unsupervised approach to specify the domains and quality dimensionscan be a new way to build a data-driven conceptual space. This suggestionmight need to use clustering methods and clustering index criteria to form theclusters of data as concepts that are not labelled by any specific term, but stillare explainable by their quality dimensions.

Moreover, the proposed concept representation provides a set of geometri-cal domains, which has been used in the inference process. This representationhas the potential to be utilised for other cognitive tasks such as concept combi-nation, inductive inference, and property reasoning. This would be a promisingresearch direction, especially when it comes to data-driven representations ofthe cognitive architectures.

While evaluating the proposed approach, a new hypothesis has been raisedrelated to the topic of referring expression generation in the NLG area. The ideaof comparing the descriptions from multi-domain spaces versus from single-


Figure10.1:Human-specifiedsemanticfeaturesforanimaldomain.

tweenlow-levelfeaturesandlinguisticwords[204].Directlyaskingtheexpertsoruserstospecifytheattributesisanobviouswaytoprovidebothunderstand-ableanddiscriminativefeatures.Asanexampleonecandifferbetweendiffer-entanimalsbyspecifyingthesemanticfeatureslikehairtype,domestic/wild,orhavinghorns(SeeFigure10.1forsomeexamplesofhuman-specifiedsemanticfeatures).Anotherwayistoverifytheattributesthatarescientificallymeasur-ableandpotentiallyinterpretableinnaturallanguagebutarenotobvioustospecifybyhumansinthefirstglance.Suchexamplesofthesefeaturesforan-imaldomainareagilityorhibernationwhicharediscriminativefeaturesforcategorisinganimalspecies[135].Thesewaysofacquiringsemanticfeaturesthenguaranteethattheinputfeaturestotheconstructionofconceptualspacesarehumanexplainableinformationthatcanbelaterusedforinferringlinguisticdescriptionsofunknownobservationsaswell.

Constructionofconceptualspaceshastheadvantageofusinglabelledinputobservationstoformtheconcepts.Anotherwaytolookattheproblemofturningnumericaldataintotheconceptualspacesistodealwithunlabelleddatasets.Inthiscase,therewillbenopre-definedclassesorconceptsforthedata.Anunsupervisedapproachtospecifythedomainsandqualitydimensionscanbeanewwaytobuildadata-drivenconceptualspace.Thissuggestionmightneedtouseclusteringmethodsandclusteringindexcriteriatoformtheclustersofdataasconceptsthatarenotlabelledbyanyspecificterm,butstillareexplainablebytheirqualitydimensions.

Moreover,theproposedconceptrepresentationprovidesasetofgeometri-caldomains,whichhasbeenusedintheinferenceprocess.Thisrepresentationhasthepotentialtobeutilisedforothercognitivetaskssuchasconceptcombi-nation,inductiveinference,andpropertyreasoning.Thiswouldbeapromisingresearchdirection,especiallywhenitcomestodata-drivenrepresentationsofthecognitivearchitectures.

Whileevaluatingtheproposedapproach,anewhypothesishasbeenraisedrelatedtothetopicofreferringexpressiongenerationintheNLGarea.Theideaofcomparingthedescriptionsfrommulti-domainspacesversusfromsingle-


Figure 10.1: Human-specified semantic features for animal domain.

tween low-level features and linguistic words [204]. Directly asking the expertsor users to specify the attributes is an obvious way to provide both understand-able and discriminative features. As an example one can differ between differ-ent animals by specifying the semantic features like hair type, domestic/wild, orhaving horns (See Figure 10.1 for some examples of human-specified semanticfeatures). Another way is to verify the attributes that are scientifically measur-able and potentially interpretable in natural language but are not obvious tospecify by humans in the first glance. Such examples of these features for an-imal domain are agility or hibernation which are discriminative features forcategorising animal species [135]. These ways of acquiring semantic featuresthen guarantee that the input features to the construction of conceptual spacesare human explainable information that can be later used for inferring linguisticdescriptions of unknown observations as well.

Construction of conceptual spaces has the advantage of using labelled inputobservations to form the concepts. Another way to look at the problem ofturning numerical data into the conceptual spaces is to deal with unlabelleddata sets. In this case, there will be no pre-defined classes or concepts for thedata. An unsupervised approach to specify the domains and quality dimensionscan be a new way to build a data-driven conceptual space. This suggestionmight need to use clustering methods and clustering index criteria to form theclusters of data as concepts that are not labelled by any specific term, but stillare explainable by their quality dimensions.

Moreover, the proposed concept representation provides a set of geometri-cal domains, which has been used in the inference process. This representationhas the potential to be utilised for other cognitive tasks such as concept combi-nation, inductive inference, and property reasoning. This would be a promisingresearch direction, especially when it comes to data-driven representations ofthe cognitive architectures.

While evaluating the proposed approach, a new hypothesis has been raisedrelated to the topic of referring expression generation in the NLG area. The ideaof comparing the descriptions from multi-domain spaces versus from single-


Figure10.1:Human-specifiedsemanticfeaturesforanimaldomain.

tweenlow-levelfeaturesandlinguisticwords[204].Directlyaskingtheexpertsoruserstospecifytheattributesisanobviouswaytoprovidebothunderstand-ableanddiscriminativefeatures.Asanexampleonecandifferbetweendiffer-entanimalsbyspecifyingthesemanticfeatureslikehairtype,domestic/wild,orhavinghorns(SeeFigure10.1forsomeexamplesofhuman-specifiedsemanticfeatures).Anotherwayistoverifytheattributesthatarescientificallymeasur-ableandpotentiallyinterpretableinnaturallanguagebutarenotobvioustospecifybyhumansinthefirstglance.Suchexamplesofthesefeaturesforan-imaldomainareagilityorhibernationwhicharediscriminativefeaturesforcategorisinganimalspecies[135].Thesewaysofacquiringsemanticfeaturesthenguaranteethattheinputfeaturestotheconstructionofconceptualspacesarehumanexplainableinformationthatcanbelaterusedforinferringlinguisticdescriptionsofunknownobservationsaswell.

Constructionofconceptualspaceshastheadvantageofusinglabelledinputobservationstoformtheconcepts.Anotherwaytolookattheproblemofturningnumericaldataintotheconceptualspacesistodealwithunlabelleddatasets.Inthiscase,therewillbenopre-definedclassesorconceptsforthedata.Anunsupervisedapproachtospecifythedomainsandqualitydimensionscanbeanewwaytobuildadata-drivenconceptualspace.Thissuggestionmightneedtouseclusteringmethodsandclusteringindexcriteriatoformtheclustersofdataasconceptsthatarenotlabelledbyanyspecificterm,butstillareexplainablebytheirqualitydimensions.

Moreover,theproposedconceptrepresentationprovidesasetofgeometri-caldomains,whichhasbeenusedintheinferenceprocess.Thisrepresentationhasthepotentialtobeutilisedforothercognitivetaskssuchasconceptcombi-nation,inductiveinference,andpropertyreasoning.Thiswouldbeapromisingresearchdirection,especiallywhenitcomestodata-drivenrepresentationsofthecognitivearchitectures.

Whileevaluatingtheproposedapproach,anewhypothesishasbeenraisedrelatedtothetopicofreferringexpressiongenerationintheNLGarea.Theideaofcomparingthedescriptionsfrommulti-domainspacesversusfromsingle-


domain spaces brings up the question of referring to an unknown object bywhether describing its features or mentioning its closest known concepts. Fromthe cognitive point of view, one question is how humans prefer to use the fea-tures in order to refer to an object. One possible solution is to to use a mixtureof known associated concepts and the relevant descriptive features. For exam-ple, in the domain of animals, one can describe a Unicorn (an unknown animallet’s say) as “This animal is like a Horse (a known class of animals), but it hasa single large horn (a feature for animals) on its head.” Throughout the eval-uation of this work, some textual results have been provided to support thishypothesis, along with some comparisons using an empirical evaluation. How-ever, this hypothesis is not formalised as a new aspect of referring expressiongeneration in NLG.

Another direction for the future work would be to assess the quality ofthe automatically derived domains and dimensions. As Gärdenfors discussedin [88], there is a need to determine evaluation criteria to choose among com-peting conceptual spaces. The proposed framework has the potential to addressthis need by defining statistical measures to compare the specified domains in adata-driven manner.

Physiological Sensor Data: Mining and Describing

Turning to the data analysis of the physiological sensor data, there are manyfuture research directions to improve or extend the proposed approaches. Oneof the critical updates might be to extend the pattern abstraction approach todeal with newly recorded data in a streaming data set. In the current version,the prototypical patterns are abstracted by analysing the entire time series atonce. It could be interesting to extend the algorithm to incrementally updatethe patterns using the new bunch of recorded data. This extension will alsohelp to handle any large-scale recording sensor data that is crucial for rulemining approach.

Additionally, involving the causality to the process of rule mining is anotherdirection of research, which impacts profoundly on the level of explainabilityof the derived information. One extension would be to use data-driven waysto identify causes and effects of the patterns’ behaviours using causal inferenceapproaches and then describing them in the form of e.g., if-else statements.This identification will help the system to obtain more enriched (still unseen,but more interesting) information.

The proposed approach for generating linguistic descriptions of the phys-iological patterns has the ability to be improved in future research. First, itwould be worth to compare the generated text by conceptual spaces with thetemplate-based generated text. Although these two approaches aim to capturethe same behaviours of the pattern, still they might be differently accepted orused by the end user. Second, there is a lack of extrinsic evaluation of the frame-work in a real-world application within the medical domain to see the actual


domainspacesbringsupthequestionofreferringtoanunknownobjectbywhetherdescribingitsfeaturesormentioningitsclosestknownconcepts.Fromthecognitivepointofview,onequestionishowhumansprefertousethefea-turesinordertorefertoanobject.Onepossiblesolutionistotouseamixtureofknownassociatedconceptsandtherelevantdescriptivefeatures.Forexam-ple,inthedomainofanimals,onecandescribeaUnicorn(anunknownanimallet’ssay)as“ThisanimalislikeaHorse(aknownclassofanimals),butithasasinglelargehorn(afeatureforanimals)onitshead.”Throughouttheeval-uationofthiswork,sometextualresultshavebeenprovidedtosupportthishypothesis,alongwithsomecomparisonsusinganempiricalevaluation.How-ever,thishypothesisisnotformalisedasanewaspectofreferringexpressiongenerationinNLG.

Anotherdirectionforthefutureworkwouldbetoassessthequalityoftheautomaticallyderiveddomainsanddimensions.AsGärdenforsdiscussedin[88],thereisaneedtodetermineevaluationcriteriatochooseamongcom-petingconceptualspaces.Theproposedframeworkhasthepotentialtoaddressthisneedbydefiningstatisticalmeasurestocomparethespecifieddomainsinadata-drivenmanner.

PhysiologicalSensorData:MiningandDescribing

Turningtothedataanalysisofthephysiologicalsensordata,therearemanyfutureresearchdirectionstoimproveorextendtheproposedapproaches.Oneofthecriticalupdatesmightbetoextendthepatternabstractionapproachtodealwithnewlyrecordeddatainastreamingdataset.Inthecurrentversion,theprototypicalpatternsareabstractedbyanalysingtheentiretimeseriesatonce.Itcouldbeinterestingtoextendthealgorithmtoincrementallyupdatethepatternsusingthenewbunchofrecordeddata.Thisextensionwillalsohelptohandleanylarge-scalerecordingsensordatathatiscrucialforruleminingapproach.

Additionally,involvingthecausalitytotheprocessofruleminingisanotherdirectionofresearch,whichimpactsprofoundlyonthelevelofexplainabilityofthederivedinformation.Oneextensionwouldbetousedata-drivenwaystoidentifycausesandeffectsofthepatterns’behavioursusingcausalinferenceapproachesandthendescribingthemintheformofe.g.,if-elsestatements.Thisidentificationwillhelpthesystemtoobtainmoreenriched(stillunseen,butmoreinteresting)information.

Theproposedapproachforgeneratinglinguisticdescriptionsofthephys-iologicalpatternshastheabilitytobeimprovedinfutureresearch.First,itwouldbeworthtocomparethegeneratedtextbyconceptualspaceswiththetemplate-basedgeneratedtext.Althoughthesetwoapproachesaimtocapturethesamebehavioursofthepattern,stilltheymightbedifferentlyacceptedorusedbytheenduser.Second,thereisalackofextrinsicevaluationoftheframe-workinareal-worldapplicationwithinthemedicaldomaintoseetheactual


domain spaces brings up the question of referring to an unknown object bywhether describing its features or mentioning its closest known concepts. Fromthe cognitive point of view, one question is how humans prefer to use the fea-tures in order to refer to an object. One possible solution is to to use a mixtureof known associated concepts and the relevant descriptive features. For exam-ple, in the domain of animals, one can describe a Unicorn (an unknown animallet’s say) as “This animal is like a Horse (a known class of animals), but it hasa single large horn (a feature for animals) on its head.” Throughout the eval-uation of this work, some textual results have been provided to support thishypothesis, along with some comparisons using an empirical evaluation. How-ever, this hypothesis is not formalised as a new aspect of referring expressiongeneration in NLG.

Another direction for the future work would be to assess the quality ofthe automatically derived domains and dimensions. As Gärdenfors discussedin [88], there is a need to determine evaluation criteria to choose among com-peting conceptual spaces. The proposed framework has the potential to addressthis need by defining statistical measures to compare the specified domains in adata-driven manner.

Physiological Sensor Data: Mining and Describing

Turning to the data analysis of the physiological sensor data, there are manyfuture research directions to improve or extend the proposed approaches. Oneof the critical updates might be to extend the pattern abstraction approach todeal with newly recorded data in a streaming data set. In the current version,the prototypical patterns are abstracted by analysing the entire time series atonce. It could be interesting to extend the algorithm to incrementally updatethe patterns using the new bunch of recorded data. This extension will alsohelp to handle any large-scale recording sensor data that is crucial for rulemining approach.

Additionally, involving the causality to the process of rule mining is anotherdirection of research, which impacts profoundly on the level of explainabilityof the derived information. One extension would be to use data-driven waysto identify causes and effects of the patterns’ behaviours using causal inferenceapproaches and then describing them in the form of e.g., if-else statements.This identification will help the system to obtain more enriched (still unseen,but more interesting) information.

The proposed approach for generating linguistic descriptions of the phys-iological patterns has the ability to be improved in future research. First, itwould be worth to compare the generated text by conceptual spaces with thetemplate-based generated text. Although these two approaches aim to capturethe same behaviours of the pattern, still they might be differently accepted orused by the end user. Second, there is a lack of extrinsic evaluation of the frame-work in a real-world application within the medical domain to see the actual


domainspacesbringsupthequestionofreferringtoanunknownobjectbywhetherdescribingitsfeaturesormentioningitsclosestknownconcepts.Fromthecognitivepointofview,onequestionishowhumansprefertousethefea-turesinordertorefertoanobject.Onepossiblesolutionistotouseamixtureofknownassociatedconceptsandtherelevantdescriptivefeatures.Forexam-ple,inthedomainofanimals,onecandescribeaUnicorn(anunknownanimallet’ssay)as“ThisanimalislikeaHorse(aknownclassofanimals),butithasasinglelargehorn(afeatureforanimals)onitshead.”Throughouttheeval-uationofthiswork,sometextualresultshavebeenprovidedtosupportthishypothesis,alongwithsomecomparisonsusinganempiricalevaluation.How-ever,thishypothesisisnotformalisedasanewaspectofreferringexpressiongenerationinNLG.

Anotherdirectionforthefutureworkwouldbetoassessthequalityoftheautomaticallyderiveddomainsanddimensions.AsGärdenforsdiscussedin[88],thereisaneedtodetermineevaluationcriteriatochooseamongcom-petingconceptualspaces.Theproposedframeworkhasthepotentialtoaddressthisneedbydefiningstatisticalmeasurestocomparethespecifieddomainsinadata-drivenmanner.

PhysiologicalSensorData:MiningandDescribing

Turningtothedataanalysisofthephysiologicalsensordata,therearemanyfutureresearchdirectionstoimproveorextendtheproposedapproaches.Oneofthecriticalupdatesmightbetoextendthepatternabstractionapproachtodealwithnewlyrecordeddatainastreamingdataset.Inthecurrentversion,theprototypicalpatternsareabstractedbyanalysingtheentiretimeseriesatonce.Itcouldbeinterestingtoextendthealgorithmtoincrementallyupdatethepatternsusingthenewbunchofrecordeddata.Thisextensionwillalsohelptohandleanylarge-scalerecordingsensordatathatiscrucialforruleminingapproach.

Additionally,involvingthecausalitytotheprocessofruleminingisanotherdirectionofresearch,whichimpactsprofoundlyonthelevelofexplainabilityofthederivedinformation.Oneextensionwouldbetousedata-drivenwaystoidentifycausesandeffectsofthepatterns’behavioursusingcausalinferenceapproachesandthendescribingthemintheformofe.g.,if-elsestatements.Thisidentificationwillhelpthesystemtoobtainmoreenriched(stillunseen,butmoreinteresting)information.

Theproposedapproachforgeneratinglinguisticdescriptionsofthephys-iologicalpatternshastheabilitytobeimprovedinfutureresearch.First,itwouldbeworthtocomparethegeneratedtextbyconceptualspaceswiththetemplate-basedgeneratedtext.Althoughthesetwoapproachesaimtocapturethesamebehavioursofthepattern,stilltheymightbedifferentlyacceptedorusedbytheenduser.Second,thereisalackofextrinsicevaluationoftheframe-workinareal-worldapplicationwithinthemedicaldomaintoseetheactual


impact of the generated text to experts such as clinicians, medical doctors, oreven caregivers. This task-based evaluation is needed to assess the usability ofthe generated text for patterns in the real-world, meaning that how much thisdata-driven information (in the form of natural language) are interesting or/andhelpful to the end users for any further decision making task. Here, the focusin the evaluation studies has largely been on identification of the observations,but adapting the proposed methods to other uses, such as decision support in amedical setting, or other audiences, such as experts versus non-experts, wouldbe an interesting road for future work.

In a more general perspective, the approaches for generating linguistic de-scriptions and natural language generation can be more involved in the field ofhealthcare monitoring. Besides considering the wearable sensor data, there ismuch other information within a healthcare system that will be more accept-able by using the natural language to explain them. This information such asmedical history, environmental sensors, activity records, personal reports, etc.(which are often ontological knowledge) can be merged with the data-driveninformation in order to analyse the health monitoring scenarios and to per-form high-level reasoning for medical purposes. Then, it would be worth touse approaches to generate linguistic descriptions in order to explain such in-ferred or reasoned information in a natural language which is understandableby humans.


impactofthegeneratedtexttoexpertssuchasclinicians,medicaldoctors,orevencaregivers.Thistask-basedevaluationisneededtoassesstheusabilityofthegeneratedtextforpatternsinthereal-world,meaningthathowmuchthisdata-driveninformation(intheformofnaturallanguage)areinterestingor/andhelpfultotheendusersforanyfurtherdecisionmakingtask.Here,thefocusintheevaluationstudieshaslargelybeenonidentificationoftheobservations,butadaptingtheproposedmethodstootheruses,suchasdecisionsupportinamedicalsetting,orotheraudiences,suchasexpertsversusnon-experts,wouldbeaninterestingroadforfuturework.

Inamoregeneralperspective,theapproachesforgeneratinglinguisticde-scriptionsandnaturallanguagegenerationcanbemoreinvolvedinthefieldofhealthcaremonitoring.Besidesconsideringthewearablesensordata,thereismuchotherinformationwithinahealthcaresystemthatwillbemoreaccept-ablebyusingthenaturallanguagetoexplainthem.Thisinformationsuchasmedicalhistory,environmentalsensors,activityrecords,personalreports,etc.(whichareoftenontologicalknowledge)canbemergedwiththedata-driveninformationinordertoanalysethehealthmonitoringscenariosandtoper-formhigh-levelreasoningformedicalpurposes.Then,itwouldbeworthtouseapproachestogeneratelinguisticdescriptionsinordertoexplainsuchin-ferredorreasonedinformationinanaturallanguagewhichisunderstandablebyhumans.


impact of the generated text to experts such as clinicians, medical doctors, oreven caregivers. This task-based evaluation is needed to assess the usability ofthe generated text for patterns in the real-world, meaning that how much thisdata-driven information (in the form of natural language) are interesting or/andhelpful to the end users for any further decision making task. Here, the focusin the evaluation studies has largely been on identification of the observations,but adapting the proposed methods to other uses, such as decision support in amedical setting, or other audiences, such as experts versus non-experts, wouldbe an interesting road for future work.

In a more general perspective, the approaches for generating linguistic de-scriptions and natural language generation can be more involved in the field ofhealthcare monitoring. Besides considering the wearable sensor data, there ismuch other information within a healthcare system that will be more accept-able by using the natural language to explain them. This information such asmedical history, environmental sensors, activity records, personal reports, etc.(which are often ontological knowledge) can be merged with the data-driveninformation in order to analyse the health monitoring scenarios and to per-form high-level reasoning for medical purposes. Then, it would be worth touse approaches to generate linguistic descriptions in order to explain such in-ferred or reasoned information in a natural language which is understandableby humans.


impactofthegeneratedtexttoexpertssuchasclinicians,medicaldoctors,orevencaregivers.Thistask-basedevaluationisneededtoassesstheusabilityofthegeneratedtextforpatternsinthereal-world,meaningthathowmuchthisdata-driveninformation(intheformofnaturallanguage)areinterestingor/andhelpfultotheendusersforanyfurtherdecisionmakingtask.Here,thefocusintheevaluationstudieshaslargelybeenonidentificationoftheobservations,butadaptingtheproposedmethodstootheruses,suchasdecisionsupportinamedicalsetting,orotheraudiences,suchasexpertsversusnon-experts,wouldbeaninterestingroadforfuturework.

Inamoregeneralperspective,theapproachesforgeneratinglinguisticde-scriptionsandnaturallanguagegenerationcanbemoreinvolvedinthefieldofhealthcaremonitoring.Besidesconsideringthewearablesensordata,thereismuchotherinformationwithinahealthcaresystemthatwillbemoreaccept-ablebyusingthenaturallanguagetoexplainthem.Thisinformationsuchasmedicalhistory,environmentalsensors,activityrecords,personalreports,etc.(whichareoftenontologicalknowledge)canbemergedwiththedata-driveninformationinordertoanalysethehealthmonitoringscenariosandtoper-formhigh-levelreasoningformedicalpurposes.Then,itwouldbeworthtouseapproachestogeneratelinguisticdescriptionsinordertoexplainsuchin-ferredorreasonedinformationinanaturallanguagewhichisunderstandablebyhumans.

10.5. FINAL WORDS 163

10.5 Final Words

To conclude this thesis, let’s revisit the story of The Elephant in the Dark andthe problem of perception limitations or, as called in this thesis, the problemof concept description. The proposed approach in this thesis made valuablecontributions to address this problem by applying data-driven methods to linkunknown perceived data to understandable linguistic descriptions. This thesiscan be called a success if a reader of the thesis will be able to “adequately”describe an elephant to a person who has neither encountered the notion of anelephant previously or seen one in real life. Hopefully, such a description wouldbe better than the one shown in Figure 10.2.

Figure 10.2: Yet another illustration for the story of The Elephant in the Dark,adapted from [194].

10.5.FINALWORDS163

10.5FinalWords

Toconcludethisthesis,let’srevisitthestoryofTheElephantintheDarkandtheproblemofperceptionlimitationsor,ascalledinthisthesis,theproblemofconceptdescription.Theproposedapproachinthisthesismadevaluablecontributionstoaddressthisproblembyapplyingdata-drivenmethodstolinkunknownperceiveddatatounderstandablelinguisticdescriptions.Thisthesiscanbecalledasuccessifareaderofthethesiswillbeableto“adequately”describeanelephanttoapersonwhohasneitherencounteredthenotionofanelephantpreviouslyorseenoneinreallife.Hopefully,suchadescriptionwouldbebetterthantheoneshowninFigure10.2.

Figure10.2:YetanotherillustrationforthestoryofTheElephantintheDark,adaptedfrom[194].

10.5. FINAL WORDS 163

10.5 Final Words

To conclude this thesis, let’s revisit the story of The Elephant in the Dark andthe problem of perception limitations or, as called in this thesis, the problemof concept description. The proposed approach in this thesis made valuablecontributions to address this problem by applying data-driven methods to linkunknown perceived data to understandable linguistic descriptions. This thesiscan be called a success if a reader of the thesis will be able to “adequately”describe an elephant to a person who has neither encountered the notion of anelephant previously or seen one in real life. Hopefully, such a description wouldbe better than the one shown in Figure 10.2.

Figure 10.2: Yet another illustration for the story of The Elephant in the Dark,adapted from [194].

10.5.FINALWORDS163

10.5FinalWords

Toconcludethisthesis,let’srevisitthestoryofTheElephantintheDarkandtheproblemofperceptionlimitationsor,ascalledinthisthesis,theproblemofconceptdescription.Theproposedapproachinthisthesismadevaluablecontributionstoaddressthisproblembyapplyingdata-drivenmethodstolinkunknownperceiveddatatounderstandablelinguisticdescriptions.Thisthesiscanbecalledasuccessifareaderofthethesiswillbeableto“adequately”describeanelephanttoapersonwhohasneitherencounteredthenotionofanelephantpreviouslyorseenoneinreallife.Hopefully,suchadescriptionwouldbebetterthantheoneshowninFigure10.2.

Figure10.2:YetanotherillustrationforthestoryofTheElephantintheDark,adaptedfrom[194].

Bibliography

[1] Blind men and the elephant. Available online: http://www.allaboutphilosophy.org/blind-men-and-the-elephant.htm (Ac-cessed March 3, 2018). (Cited on page 2.)

[2] Eu general data protection regulation (gdpr). Available online: https://www.eugdpr.org/. (Accessed March 5, 2018). (Cited on page 159.)

[3] Mimic II data set. Available online: physionet.org/physiobank/database/mimicdb/numerics. (Accessed March 3, 2018). (Cited onpage 107.)

[4] Physiobank archive index. Available online: http://www.physionet.org/physiobank/database/ (Accessed March 3, 2018). (Cited on page99.)

[5] Zephyr™ bioharness3. Available online: http://www.zephyr-technology.nl/en/product/71/zephyr-bioharness.html.(Accessed April 10, 2013). (Cited on pages x and 106.)

[6] Zephyr™ strap. Available online: https://www.zephyranywhere.com/online-store. (Accessed November 29, 2016). (Cited on page 106.)

[7] The Heath Readers by Grades, Book Five. (public domain image onpage 69), Boston: D.C. Heath & co., Boston, USA, 1907. (Cited onpages ix and 1.)

[8] Benjamin Adams and Martin Raubal. Conceptual space markup lan-guage (csml): Towards the cognitive semantic web. In IEEE Interna-tional Conference on Semantic Computing (ICSC) 2009, pages 253–260,2009. (Cited on pages 25 and 51.)

[9] Benjamin Adams and Martin Raubal. A metric conceptual space algebra.In Spatial information theory, pages 51–68. Springer, 2009. (Cited onpages 21, 35, 47, 56, 62, and 66.)

165

Bibliography

[1]Blindmenandtheelephant.Availableonline:http://www.allaboutphilosophy.org/blind-men-and-the-elephant.htm(Ac-cessedMarch3,2018).(Citedonpage2.)

[2]Eugeneraldataprotectionregulation(gdpr).Availableonline:https://www.eugdpr.org/.(AccessedMarch5,2018).(Citedonpage159.)

[3]MimicIIdataset.Availableonline:physionet.org/physiobank/database/mimicdb/numerics.(AccessedMarch3,2018).(Citedonpage107.)

[4]Physiobankarchiveindex.Availableonline:http://www.physionet.org/physiobank/database/(AccessedMarch3,2018).(Citedonpage99.)

[5]Zephyr™bioharness3.Availableonline:http://www.zephyr-technology.nl/en/product/71/zephyr-bioharness.html.(AccessedApril10,2013).(Citedonpagesxand106.)

[6]Zephyr™strap.Availableonline:https://www.zephyranywhere.com/online-store.(AccessedNovember29,2016).(Citedonpage106.)

[7]TheHeathReadersbyGrades,BookFive.(publicdomainimageonpage69),Boston:D.C.Heath&co.,Boston,USA,1907.(Citedonpagesixand1.)

[8]BenjaminAdamsandMartinRaubal.Conceptualspacemarkuplan-guage(csml):Towardsthecognitivesemanticweb.InIEEEInterna-tionalConferenceonSemanticComputing(ICSC)2009,pages253–260,2009.(Citedonpages25and51.)

[9]BenjaminAdamsandMartinRaubal.Ametricconceptualspacealgebra.InSpatialinformationtheory,pages51–68.Springer,2009.(Citedonpages21,35,47,56,62,and66.)

165

Bibliography

[1] Blind men and the elephant. Available online: http://www.allaboutphilosophy.org/blind-men-and-the-elephant.htm (Ac-cessed March 3, 2018). (Cited on page 2.)

[2] Eu general data protection regulation (gdpr). Available online: https://www.eugdpr.org/. (Accessed March 5, 2018). (Cited on page 159.)

[3] Mimic II data set. Available online: physionet.org/physiobank/database/mimicdb/numerics. (Accessed March 3, 2018). (Cited onpage 107.)

[4] Physiobank archive index. Available online: http://www.physionet.org/physiobank/database/ (Accessed March 3, 2018). (Cited on page99.)

[5] Zephyr™ bioharness3. Available online: http://www.zephyr-technology.nl/en/product/71/zephyr-bioharness.html.(Accessed April 10, 2013). (Cited on pages x and 106.)

[6] Zephyr™ strap. Available online: https://www.zephyranywhere.com/online-store. (Accessed November 29, 2016). (Cited on page 106.)

[7] The Heath Readers by Grades, Book Five. (public domain image onpage 69), Boston: D.C. Heath & co., Boston, USA, 1907. (Cited onpages ix and 1.)

[8] Benjamin Adams and Martin Raubal. Conceptual space markup lan-guage (csml): Towards the cognitive semantic web. In IEEE Interna-tional Conference on Semantic Computing (ICSC) 2009, pages 253–260,2009. (Cited on pages 25 and 51.)

[9] Benjamin Adams and Martin Raubal. A metric conceptual space algebra.In Spatial information theory, pages 51–68. Springer, 2009. (Cited onpages 21, 35, 47, 56, 62, and 66.)

165

Bibliography

[1]Blindmenandtheelephant.Availableonline:http://www.allaboutphilosophy.org/blind-men-and-the-elephant.htm(Ac-cessedMarch3,2018).(Citedonpage2.)

[2]Eugeneraldataprotectionregulation(gdpr).Availableonline:https://www.eugdpr.org/.(AccessedMarch5,2018).(Citedonpage159.)

[3]MimicIIdataset.Availableonline:physionet.org/physiobank/database/mimicdb/numerics.(AccessedMarch3,2018).(Citedonpage107.)

[4]Physiobankarchiveindex.Availableonline:http://www.physionet.org/physiobank/database/(AccessedMarch3,2018).(Citedonpage99.)

[5]Zephyr™bioharness3.Availableonline:http://www.zephyr-technology.nl/en/product/71/zephyr-bioharness.html.(AccessedApril10,2013).(Citedonpagesxand106.)

[6]Zephyr™strap.Availableonline:https://www.zephyranywhere.com/online-store.(AccessedNovember29,2016).(Citedonpage106.)

[7]TheHeathReadersbyGrades,BookFive.(publicdomainimageonpage69),Boston:D.C.Heath&co.,Boston,USA,1907.(Citedonpagesixand1.)

[8]BenjaminAdamsandMartinRaubal.Conceptualspacemarkuplan-guage(csml):Towardsthecognitivesemanticweb.InIEEEInterna-tionalConferenceonSemanticComputing(ICSC)2009,pages253–260,2009.(Citedonpages25and51.)

[9]BenjaminAdamsandMartinRaubal.Ametricconceptualspacealgebra.InSpatialinformationtheory,pages51–68.Springer,2009.(Citedonpages21,35,47,56,62,and66.)

165

166 BIBLIOGRAPHY

[10] Francesco Agostaro, Agnese Augello, Giovanni Pilato, Giorgio Vassallo,and Salvatore Gaglio. A conversational agent based on a conceptualinterpretation of a data driven semantic space. In Congress of the ItalianAssociation for Artificial Intelligence, pages 381–392. Springer, 2005.(Cited on pages 4, 33, and 35.)

[11] Rakesh Agrawal, Tomasz Imielinski, and Arun Swami. Mining asso-ciation rules between sets of items in large databases. In Acm sigmodrecord, volume 22, pages 207–216. ACM, 1993. (Cited on page 123.)

[12] David W Aha, Dennis Kibler, and Marc K Albert. Instance-based learn-ing algorithms. Machine learning, 6(1):37–66, 1991. (Cited on page26.)

[13] Nor Faizah Ahmad, Doan B. Hoang, and M. Hoang Phung. Robustpreprocessing for health care monitoring framework. In Proceedings ofthe 11th international conference on e-Health networking, applicationsand services, pages 169–174, Piscataway, NJ, USA, 2009. IEEE Press.(Cited on pages 97 and 100.)

[14] Janet Aisbett and Greg Gibbon. A general formulation of conceptualspaces as a meso level representation. Artificial Intelligence, 133(1):189–232, 2001. (Cited on pages 3, 20, 21, 24, 25, 26, 35, 53, 54, 55, and 62.)

[15] Janet Aisbett, John T Rickard, and Greg Gibbon. Conceptual spacesand computing with words. In Applications of Conceptual Spaces, pages123–139. Springer, 2015. (Cited on page 33.)

[16] Ahmad A. Al-Hajji. Rule-based expert system for diagnosis and symp-tom of neurological disorders "neurologist expert system (nes)". In Pro-ceedings of the 1st Taibah University International Conference on Com-puting and Information Technology, pages 67–72, Buraydah, March2012. Qassim University. (Cited on page 98.)

[17] Marjan Alirezaie and Amy Loutfi. Automatic annotation of sensor datastreams using abductive reasoning. In 5th International Conference onKnowledge Engineering and Ontology Development, September 2013.(Cited on page 103.)

[18] James F Allen. Towards a general theory of action and time. Artificialintelligence, 23(2):123–154, 1984. (Cited on page 121.)

[19] Jens S Allwood and Peter Gärdenfors. Cognitive semantics: Meaningand cognition, volume 55. John Benjamins Publishing, 1999. (Cited onpage 20.)

166BIBLIOGRAPHY

[10]FrancescoAgostaro,AgneseAugello,GiovanniPilato,GiorgioVassallo,andSalvatoreGaglio.Aconversationalagentbasedonaconceptualinterpretationofadatadrivensemanticspace.InCongressoftheItalianAssociationforArtificialIntelligence,pages381–392.Springer,2005.(Citedonpages4,33,and35.)

[11]RakeshAgrawal,TomaszImielinski,andArunSwami.Miningasso-ciationrulesbetweensetsofitemsinlargedatabases.InAcmsigmodrecord,volume22,pages207–216.ACM,1993.(Citedonpage123.)

[12]DavidWAha,DennisKibler,andMarcKAlbert.Instance-basedlearn-ingalgorithms.Machinelearning,6(1):37–66,1991.(Citedonpage26.)

[13]NorFaizahAhmad,DoanB.Hoang,andM.HoangPhung.Robustpreprocessingforhealthcaremonitoringframework.InProceedingsofthe11thinternationalconferenceone-Healthnetworking,applicationsandservices,pages169–174,Piscataway,NJ,USA,2009.IEEEPress.(Citedonpages97and100.)

[14]JanetAisbettandGregGibbon.Ageneralformulationofconceptualspacesasamesolevelrepresentation.ArtificialIntelligence,133(1):189–232,2001.(Citedonpages3,20,21,24,25,26,35,53,54,55,and62.)

[15]JanetAisbett,JohnTRickard,andGregGibbon.Conceptualspacesandcomputingwithwords.InApplicationsofConceptualSpaces,pages123–139.Springer,2015.(Citedonpage33.)

[16]AhmadA.Al-Hajji.Rule-basedexpertsystemfordiagnosisandsymp-tomofneurologicaldisorders"neurologistexpertsystem(nes)".InPro-ceedingsofthe1stTaibahUniversityInternationalConferenceonCom-putingandInformationTechnology,pages67–72,Buraydah,March2012.QassimUniversity.(Citedonpage98.)

[17]MarjanAlirezaieandAmyLoutfi.Automaticannotationofsensordatastreamsusingabductivereasoning.In5thInternationalConferenceonKnowledgeEngineeringandOntologyDevelopment,September2013.(Citedonpage103.)

[18]JamesFAllen.Towardsageneraltheoryofactionandtime.Artificialintelligence,23(2):123–154,1984.(Citedonpage121.)

[19]JensSAllwoodandPeterGärdenfors.Cognitivesemantics:Meaningandcognition,volume55.JohnBenjaminsPublishing,1999.(Citedonpage20.)

166 BIBLIOGRAPHY

[10] Francesco Agostaro, Agnese Augello, Giovanni Pilato, Giorgio Vassallo,and Salvatore Gaglio. A conversational agent based on a conceptualinterpretation of a data driven semantic space. In Congress of the ItalianAssociation for Artificial Intelligence, pages 381–392. Springer, 2005.(Cited on pages 4, 33, and 35.)

[11] Rakesh Agrawal, Tomasz Imielinski, and Arun Swami. Mining asso-ciation rules between sets of items in large databases. In Acm sigmodrecord, volume 22, pages 207–216. ACM, 1993. (Cited on page 123.)

[12] David W Aha, Dennis Kibler, and Marc K Albert. Instance-based learn-ing algorithms. Machine learning, 6(1):37–66, 1991. (Cited on page26.)

[13] Nor Faizah Ahmad, Doan B. Hoang, and M. Hoang Phung. Robustpreprocessing for health care monitoring framework. In Proceedings ofthe 11th international conference on e-Health networking, applicationsand services, pages 169–174, Piscataway, NJ, USA, 2009. IEEE Press.(Cited on pages 97 and 100.)

[14] Janet Aisbett and Greg Gibbon. A general formulation of conceptualspaces as a meso level representation. Artificial Intelligence, 133(1):189–232, 2001. (Cited on pages 3, 20, 21, 24, 25, 26, 35, 53, 54, 55, and 62.)

[15] Janet Aisbett, John T Rickard, and Greg Gibbon. Conceptual spacesand computing with words. In Applications of Conceptual Spaces, pages123–139. Springer, 2015. (Cited on page 33.)

[16] Ahmad A. Al-Hajji. Rule-based expert system for diagnosis and symp-tom of neurological disorders "neurologist expert system (nes)". In Pro-ceedings of the 1st Taibah University International Conference on Com-puting and Information Technology, pages 67–72, Buraydah, March2012. Qassim University. (Cited on page 98.)

[17] Marjan Alirezaie and Amy Loutfi. Automatic annotation of sensor datastreams using abductive reasoning. In 5th International Conference onKnowledge Engineering and Ontology Development, September 2013.(Cited on page 103.)

[18] James F Allen. Towards a general theory of action and time. Artificialintelligence, 23(2):123–154, 1984. (Cited on page 121.)

[19] Jens S Allwood and Peter Gärdenfors. Cognitive semantics: Meaningand cognition, volume 55. John Benjamins Publishing, 1999. (Cited onpage 20.)

166BIBLIOGRAPHY

[10]FrancescoAgostaro,AgneseAugello,GiovanniPilato,GiorgioVassallo,andSalvatoreGaglio.Aconversationalagentbasedonaconceptualinterpretationofadatadrivensemanticspace.InCongressoftheItalianAssociationforArtificialIntelligence,pages381–392.Springer,2005.(Citedonpages4,33,and35.)

[11]RakeshAgrawal,TomaszImielinski,andArunSwami.Miningasso-ciationrulesbetweensetsofitemsinlargedatabases.InAcmsigmodrecord,volume22,pages207–216.ACM,1993.(Citedonpage123.)

[12]DavidWAha,DennisKibler,andMarcKAlbert.Instance-basedlearn-ingalgorithms.Machinelearning,6(1):37–66,1991.(Citedonpage26.)

[13]NorFaizahAhmad,DoanB.Hoang,andM.HoangPhung.Robustpreprocessingforhealthcaremonitoringframework.InProceedingsofthe11thinternationalconferenceone-Healthnetworking,applicationsandservices,pages169–174,Piscataway,NJ,USA,2009.IEEEPress.(Citedonpages97and100.)

[14]JanetAisbettandGregGibbon.Ageneralformulationofconceptualspacesasamesolevelrepresentation.ArtificialIntelligence,133(1):189–232,2001.(Citedonpages3,20,21,24,25,26,35,53,54,55,and62.)

[15]JanetAisbett,JohnTRickard,andGregGibbon.Conceptualspacesandcomputingwithwords.InApplicationsofConceptualSpaces,pages123–139.Springer,2015.(Citedonpage33.)

[16]AhmadA.Al-Hajji.Rule-basedexpertsystemfordiagnosisandsymp-tomofneurologicaldisorders"neurologistexpertsystem(nes)".InPro-ceedingsofthe1stTaibahUniversityInternationalConferenceonCom-putingandInformationTechnology,pages67–72,Buraydah,March2012.QassimUniversity.(Citedonpage98.)

[17]MarjanAlirezaieandAmyLoutfi.Automaticannotationofsensordatastreamsusingabductivereasoning.In5thInternationalConferenceonKnowledgeEngineeringandOntologyDevelopment,September2013.(Citedonpage103.)

[18]JamesFAllen.Towardsageneraltheoryofactionandtime.Artificialintelligence,23(2):123–154,1984.(Citedonpage121.)

[19]JensSAllwoodandPeterGärdenfors.Cognitivesemantics:Meaningandcognition,volume55.JohnBenjaminsPublishing,1999.(Citedonpage20.)

BIBLIOGRAPHY 167

[20] Miguel R Álvarez, Paulo FéLix, and PurificacióN CariñEna. Discoveringmetric temporal constraint networks on temporal databases. Artificialintelligence in medicine, 58(3):139–154, 2013. (Cited on page 122.)

[21] Filippo Amato, Alberto López Rodríguez, Eladia María PeñaandMén-dez, Petr Vanhara, Aleš Hampl, and Josef Havel. Artificial neural net-works in medical diagnosis. Journal of Applied Biomedicine, 11:47–58,2013. (Cited on page 97.)

[22] Daniele Apiletti, Elena Baralis, Giulia Bruno, and Tania Cerquitelli.Real-time analysis of physiological data to support medical applica-tions. IEEE Transactions on Information Technology in Biomedicine,13(3):313–321, May 2009. (Cited on pages 96, 97, 99, and 100.)

[23] Louis Atallah, Benny Lo, and Guang-Zhong Yang. Can pervasive sensingaddress current challenges in global healthcare? Journal of Epidemiologyand Global Health, 2(1):1–13, 2012. (Cited on page 92.)

[24] Akin Avci, Stephan Bosch, Mihai Marin-Perianu, Raluca Marin-Perianu,and Paul Havinga. Activity recognition using inertial sensing for health-care, wellbeing and sports applications: A survey. In Proceedings of the23th International Conference on Architecture of Computing Systems,pages 167–176, Berlin, February 2010. VDE Verlag. (Cited on page96.)

[25] Jody Azzouni. Semantic perception: How the illusion of a common lan-guage arises and persists. Oxford University Press, 2015. (Cited on page20.)

[26] Joonbum Bae and Masayoshi Tomizuka. Gait phase analysis based on ahidden markov model. Mechatronics, 21(6):961–970, 2011. (Cited onpage 98.)

[27] MirzaMansoor Baig and Hamid Gholamhosseini. Smart health moni-toring systems: An overview of design and modeling. Journal of MedicalSystems, 37(2):1–14, 2013. (Cited on pages 92 and 93.)

[28] Hadi Banaee, Mobyen Uddin Ahmed, and Amy Loutfi. Data mining forwearable sensors in health monitoring systems: a review of recent trendsand challenges. Sensors, 13(12):17472–17500, 2013. (Cited on pages4, 98, 105, and 106.)

[29] Hadi Banaee, Mobyen Uddin Ahmed, and Amy Loutfi. A framework forautomatic text generation of trends in physiological time series data. InSystems, Man, and Cybernetics (SMC), 2013 IEEE International Confer-ence on, pages 3876–3881. IEEE, 2013. (Cited on pages 103 and 132.)

BIBLIOGRAPHY167

[20]MiguelRÁlvarez,PauloFéLix,andPurificacióNCariñEna.Discoveringmetrictemporalconstraintnetworksontemporaldatabases.Artificialintelligenceinmedicine,58(3):139–154,2013.(Citedonpage122.)

[21]FilippoAmato,AlbertoLópezRodríguez,EladiaMaríaPeñaandMén-dez,PetrVanhara,AlešHampl,andJosefHavel.Artificialneuralnet-worksinmedicaldiagnosis.JournalofAppliedBiomedicine,11:47–58,2013.(Citedonpage97.)

[22]DanieleApiletti,ElenaBaralis,GiuliaBruno,andTaniaCerquitelli.Real-timeanalysisofphysiologicaldatatosupportmedicalapplica-tions.IEEETransactionsonInformationTechnologyinBiomedicine,13(3):313–321,May2009.(Citedonpages96,97,99,and100.)

[23]LouisAtallah,BennyLo,andGuang-ZhongYang.Canpervasivesensingaddresscurrentchallengesinglobalhealthcare?JournalofEpidemiologyandGlobalHealth,2(1):1–13,2012.(Citedonpage92.)

[24]AkinAvci,StephanBosch,MihaiMarin-Perianu,RalucaMarin-Perianu,andPaulHavinga.Activityrecognitionusinginertialsensingforhealth-care,wellbeingandsportsapplications:Asurvey.InProceedingsofthe23thInternationalConferenceonArchitectureofComputingSystems,pages167–176,Berlin,February2010.VDEVerlag.(Citedonpage96.)

[25]JodyAzzouni.Semanticperception:Howtheillusionofacommonlan-guagearisesandpersists.OxfordUniversityPress,2015.(Citedonpage20.)

[26]JoonbumBaeandMasayoshiTomizuka.Gaitphaseanalysisbasedonahiddenmarkovmodel.Mechatronics,21(6):961–970,2011.(Citedonpage98.)

[27]MirzaMansoorBaigandHamidGholamhosseini.Smarthealthmoni-toringsystems:Anoverviewofdesignandmodeling.JournalofMedicalSystems,37(2):1–14,2013.(Citedonpages92and93.)

[28]HadiBanaee,MobyenUddinAhmed,andAmyLoutfi.Dataminingforwearablesensorsinhealthmonitoringsystems:areviewofrecenttrendsandchallenges.Sensors,13(12):17472–17500,2013.(Citedonpages4,98,105,and106.)

[29]HadiBanaee,MobyenUddinAhmed,andAmyLoutfi.Aframeworkforautomatictextgenerationoftrendsinphysiologicaltimeseriesdata.InSystems,Man,andCybernetics(SMC),2013IEEEInternationalConfer-enceon,pages3876–3881.IEEE,2013.(Citedonpages103and132.)

BIBLIOGRAPHY 167

[20] Miguel R Álvarez, Paulo FéLix, and PurificacióN CariñEna. Discoveringmetric temporal constraint networks on temporal databases. Artificialintelligence in medicine, 58(3):139–154, 2013. (Cited on page 122.)

[21] Filippo Amato, Alberto López Rodríguez, Eladia María PeñaandMén-dez, Petr Vanhara, Aleš Hampl, and Josef Havel. Artificial neural net-works in medical diagnosis. Journal of Applied Biomedicine, 11:47–58,2013. (Cited on page 97.)

[22] Daniele Apiletti, Elena Baralis, Giulia Bruno, and Tania Cerquitelli.Real-time analysis of physiological data to support medical applica-tions. IEEE Transactions on Information Technology in Biomedicine,13(3):313–321, May 2009. (Cited on pages 96, 97, 99, and 100.)

[23] Louis Atallah, Benny Lo, and Guang-Zhong Yang. Can pervasive sensingaddress current challenges in global healthcare? Journal of Epidemiologyand Global Health, 2(1):1–13, 2012. (Cited on page 92.)

[24] Akin Avci, Stephan Bosch, Mihai Marin-Perianu, Raluca Marin-Perianu,and Paul Havinga. Activity recognition using inertial sensing for health-care, wellbeing and sports applications: A survey. In Proceedings of the23th International Conference on Architecture of Computing Systems,pages 167–176, Berlin, February 2010. VDE Verlag. (Cited on page96.)

[25] Jody Azzouni. Semantic perception: How the illusion of a common lan-guage arises and persists. Oxford University Press, 2015. (Cited on page20.)

[26] Joonbum Bae and Masayoshi Tomizuka. Gait phase analysis based on ahidden markov model. Mechatronics, 21(6):961–970, 2011. (Cited onpage 98.)

[27] MirzaMansoor Baig and Hamid Gholamhosseini. Smart health moni-toring systems: An overview of design and modeling. Journal of MedicalSystems, 37(2):1–14, 2013. (Cited on pages 92 and 93.)

[28] Hadi Banaee, Mobyen Uddin Ahmed, and Amy Loutfi. Data mining forwearable sensors in health monitoring systems: a review of recent trendsand challenges. Sensors, 13(12):17472–17500, 2013. (Cited on pages4, 98, 105, and 106.)

[29] Hadi Banaee, Mobyen Uddin Ahmed, and Amy Loutfi. A framework forautomatic text generation of trends in physiological time series data. InSystems, Man, and Cybernetics (SMC), 2013 IEEE International Confer-ence on, pages 3876–3881. IEEE, 2013. (Cited on pages 103 and 132.)

BIBLIOGRAPHY167

[20]MiguelRÁlvarez,PauloFéLix,andPurificacióNCariñEna.Discoveringmetrictemporalconstraintnetworksontemporaldatabases.Artificialintelligenceinmedicine,58(3):139–154,2013.(Citedonpage122.)

[21]FilippoAmato,AlbertoLópezRodríguez,EladiaMaríaPeñaandMén-dez,PetrVanhara,AlešHampl,andJosefHavel.Artificialneuralnet-worksinmedicaldiagnosis.JournalofAppliedBiomedicine,11:47–58,2013.(Citedonpage97.)

[22]DanieleApiletti,ElenaBaralis,GiuliaBruno,andTaniaCerquitelli.Real-timeanalysisofphysiologicaldatatosupportmedicalapplica-tions.IEEETransactionsonInformationTechnologyinBiomedicine,13(3):313–321,May2009.(Citedonpages96,97,99,and100.)

[23]LouisAtallah,BennyLo,andGuang-ZhongYang.Canpervasivesensingaddresscurrentchallengesinglobalhealthcare?JournalofEpidemiologyandGlobalHealth,2(1):1–13,2012.(Citedonpage92.)

[24]AkinAvci,StephanBosch,MihaiMarin-Perianu,RalucaMarin-Perianu,andPaulHavinga.Activityrecognitionusinginertialsensingforhealth-care,wellbeingandsportsapplications:Asurvey.InProceedingsofthe23thInternationalConferenceonArchitectureofComputingSystems,pages167–176,Berlin,February2010.VDEVerlag.(Citedonpage96.)

[25]JodyAzzouni.Semanticperception:Howtheillusionofacommonlan-guagearisesandpersists.OxfordUniversityPress,2015.(Citedonpage20.)

[26]JoonbumBaeandMasayoshiTomizuka.Gaitphaseanalysisbasedonahiddenmarkovmodel.Mechatronics,21(6):961–970,2011.(Citedonpage98.)

[27]MirzaMansoorBaigandHamidGholamhosseini.Smarthealthmoni-toringsystems:Anoverviewofdesignandmodeling.JournalofMedicalSystems,37(2):1–14,2013.(Citedonpages92and93.)

[28]HadiBanaee,MobyenUddinAhmed,andAmyLoutfi.Dataminingforwearablesensorsinhealthmonitoringsystems:areviewofrecenttrendsandchallenges.Sensors,13(12):17472–17500,2013.(Citedonpages4,98,105,and106.)

[29]HadiBanaee,MobyenUddinAhmed,andAmyLoutfi.Aframeworkforautomatictextgenerationoftrendsinphysiologicaltimeseriesdata.InSystems,Man,andCybernetics(SMC),2013IEEEInternationalConfer-enceon,pages3876–3881.IEEE,2013.(Citedonpages103and132.)

168 BIBLIOGRAPHY

[30] Hadi Banaee, Mobyen Uddin Ahmed, and Amy Loutfi. Descriptive mod-elling of clinical conditions with data-driven rule mining in physiologicaldata. In HEALTHINF 2015: 8th International Conference on Health In-formatics, 12-15 january, Lisabon, Portugal, 2015. (Cited on pages 125and 127.)

[31] Hadi Banaee and Amy Loutfi. Using conceptual spaces to model domainknowledge in data-to-text systems. In 8th International Natural Lan-guage Generation (INLG) Conference, 19-21 June, Philadelphia, Penn-sylvania, USA, pages 11–15. Association for Computational Linguistics,2014. (Cited on pages 32 and 42.)

[32] Hadi Banaee and Amy Loutfi. Data-driven rule mining and representa-tion of temporal patterns in physiological sensor data. IEEE journal ofbiomedical and health informatics, 19(5):1557–1566, 2015. (Cited onpages 4, 127, 139, and 140.)

[33] Regina Barzilay and Lillian Lee. Catching the drift: Probabilistic contentmodels, with applications to generation and summarization. In Proceed-ings of HLT-NAACL, pages 113–120, 2004. (Cited on page 32.)

[34] Roberto Battiti. Using mutual information for selecting features in su-pervised neural net learning. IEEE Transactions on Neural Networks,5(4):537–550, 1994. (Cited on page 39.)

[35] Ildar Batyrshin, Leonid Sheremetov, and Raul Herrera-Avelar. Perceptionbased patterns in time series data mining. In Perception-based DataMining and Decision Making in Economics and Finance, pages 85–118.Springer, 2007. (Cited on pages 27, 33, 139, and 140.)

[36] Ildar Z Batyrshin and LB Sheremetov. Perception-based approach to timeseries data mining. Applied Soft Computing, 8(3):1211–1221, 2008.(Cited on page 139.)

[37] Riccardo Bellazzi, Fulvia Ferrazzi, and Lucia Sacchi. Predictive data min-ing in clinical medicine: a focus on selected methods and applications.Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discov-ery, 1(5):416–430, 2011. (Cited on pages 92, 93, and 94.)

[38] Riccardo Bellazzi and Blaz Zupan. Predictive data mining in clinicalmedicine: Current issues and guidelines. International Journal of Medi-cal Informatics, 77(2):81–97, 2008. (Cited on pages 94 and 97.)

[39] Christos C. Bellos, Athanasios Papadopoulos, Roberto Rosso, and Dim-itrios I. Fotiadis. Extraction and analysis of features acquired by wear-able sensors network. In 10th IEEE International Conference on Infor-mation Technology and Applications in Biomedicine, pages 1–4, 2010.(Cited on pages 96 and 97.)

168BIBLIOGRAPHY

[30]HadiBanaee,MobyenUddinAhmed,andAmyLoutfi.Descriptivemod-ellingofclinicalconditionswithdata-drivenrulemininginphysiologicaldata.InHEALTHINF2015:8thInternationalConferenceonHealthIn-formatics,12-15january,Lisabon,Portugal,2015.(Citedonpages125and127.)

[31]HadiBanaeeandAmyLoutfi.Usingconceptualspacestomodeldomainknowledgeindata-to-textsystems.In8thInternationalNaturalLan-guageGeneration(INLG)Conference,19-21June,Philadelphia,Penn-sylvania,USA,pages11–15.AssociationforComputationalLinguistics,2014.(Citedonpages32and42.)

[32]HadiBanaeeandAmyLoutfi.Data-drivenruleminingandrepresenta-tionoftemporalpatternsinphysiologicalsensordata.IEEEjournalofbiomedicalandhealthinformatics,19(5):1557–1566,2015.(Citedonpages4,127,139,and140.)

[33]ReginaBarzilayandLillianLee.Catchingthedrift:Probabilisticcontentmodels,withapplicationstogenerationandsummarization.InProceed-ingsofHLT-NAACL,pages113–120,2004.(Citedonpage32.)

[34]RobertoBattiti.Usingmutualinformationforselectingfeaturesinsu-pervisedneuralnetlearning.IEEETransactionsonNeuralNetworks,5(4):537–550,1994.(Citedonpage39.)

[35]IldarBatyrshin,LeonidSheremetov,andRaulHerrera-Avelar.Perceptionbasedpatternsintimeseriesdatamining.InPerception-basedDataMiningandDecisionMakinginEconomicsandFinance,pages85–118.Springer,2007.(Citedonpages27,33,139,and140.)

[36]IldarZBatyrshinandLBSheremetov.Perception-basedapproachtotimeseriesdatamining.AppliedSoftComputing,8(3):1211–1221,2008.(Citedonpage139.)

[37]RiccardoBellazzi,FulviaFerrazzi,andLuciaSacchi.Predictivedatamin-inginclinicalmedicine:afocusonselectedmethodsandapplications.WileyInterdisciplinaryReviews:DataMiningandKnowledgeDiscov-ery,1(5):416–430,2011.(Citedonpages92,93,and94.)

[38]RiccardoBellazziandBlazZupan.Predictivedatamininginclinicalmedicine:Currentissuesandguidelines.InternationalJournalofMedi-calInformatics,77(2):81–97,2008.(Citedonpages94and97.)

[39]ChristosC.Bellos,AthanasiosPapadopoulos,RobertoRosso,andDim-itriosI.Fotiadis.Extractionandanalysisoffeaturesacquiredbywear-ablesensorsnetwork.In10thIEEEInternationalConferenceonInfor-mationTechnologyandApplicationsinBiomedicine,pages1–4,2010.(Citedonpages96and97.)

168 BIBLIOGRAPHY

[30] Hadi Banaee, Mobyen Uddin Ahmed, and Amy Loutfi. Descriptive mod-elling of clinical conditions with data-driven rule mining in physiologicaldata. In HEALTHINF 2015: 8th International Conference on Health In-formatics, 12-15 january, Lisabon, Portugal, 2015. (Cited on pages 125and 127.)

[31] Hadi Banaee and Amy Loutfi. Using conceptual spaces to model domainknowledge in data-to-text systems. In 8th International Natural Lan-guage Generation (INLG) Conference, 19-21 June, Philadelphia, Penn-sylvania, USA, pages 11–15. Association for Computational Linguistics,2014. (Cited on pages 32 and 42.)

[32] Hadi Banaee and Amy Loutfi. Data-driven rule mining and representa-tion of temporal patterns in physiological sensor data. IEEE journal ofbiomedical and health informatics, 19(5):1557–1566, 2015. (Cited onpages 4, 127, 139, and 140.)

[33] Regina Barzilay and Lillian Lee. Catching the drift: Probabilistic contentmodels, with applications to generation and summarization. In Proceed-ings of HLT-NAACL, pages 113–120, 2004. (Cited on page 32.)

[34] Roberto Battiti. Using mutual information for selecting features in su-pervised neural net learning. IEEE Transactions on Neural Networks,5(4):537–550, 1994. (Cited on page 39.)

[35] Ildar Batyrshin, Leonid Sheremetov, and Raul Herrera-Avelar. Perceptionbased patterns in time series data mining. In Perception-based DataMining and Decision Making in Economics and Finance, pages 85–118.Springer, 2007. (Cited on pages 27, 33, 139, and 140.)

[36] Ildar Z Batyrshin and LB Sheremetov. Perception-based approach to timeseries data mining. Applied Soft Computing, 8(3):1211–1221, 2008.(Cited on page 139.)

[37] Riccardo Bellazzi, Fulvia Ferrazzi, and Lucia Sacchi. Predictive data min-ing in clinical medicine: a focus on selected methods and applications.Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discov-ery, 1(5):416–430, 2011. (Cited on pages 92, 93, and 94.)

[38] Riccardo Bellazzi and Blaz Zupan. Predictive data mining in clinicalmedicine: Current issues and guidelines. International Journal of Medi-cal Informatics, 77(2):81–97, 2008. (Cited on pages 94 and 97.)

[39] Christos C. Bellos, Athanasios Papadopoulos, Roberto Rosso, and Dim-itrios I. Fotiadis. Extraction and analysis of features acquired by wear-able sensors network. In 10th IEEE International Conference on Infor-mation Technology and Applications in Biomedicine, pages 1–4, 2010.(Cited on pages 96 and 97.)

168BIBLIOGRAPHY

[30]HadiBanaee,MobyenUddinAhmed,andAmyLoutfi.Descriptivemod-ellingofclinicalconditionswithdata-drivenrulemininginphysiologicaldata.InHEALTHINF2015:8thInternationalConferenceonHealthIn-formatics,12-15january,Lisabon,Portugal,2015.(Citedonpages125and127.)

[31]HadiBanaeeandAmyLoutfi.Usingconceptualspacestomodeldomainknowledgeindata-to-textsystems.In8thInternationalNaturalLan-guageGeneration(INLG)Conference,19-21June,Philadelphia,Penn-sylvania,USA,pages11–15.AssociationforComputationalLinguistics,2014.(Citedonpages32and42.)

[32]HadiBanaeeandAmyLoutfi.Data-drivenruleminingandrepresenta-tionoftemporalpatternsinphysiologicalsensordata.IEEEjournalofbiomedicalandhealthinformatics,19(5):1557–1566,2015.(Citedonpages4,127,139,and140.)

[33]ReginaBarzilayandLillianLee.Catchingthedrift:Probabilisticcontentmodels,withapplicationstogenerationandsummarization.InProceed-ingsofHLT-NAACL,pages113–120,2004.(Citedonpage32.)

[34]RobertoBattiti.Usingmutualinformationforselectingfeaturesinsu-pervisedneuralnetlearning.IEEETransactionsonNeuralNetworks,5(4):537–550,1994.(Citedonpage39.)

[35]IldarBatyrshin,LeonidSheremetov,andRaulHerrera-Avelar.Perceptionbasedpatternsintimeseriesdatamining.InPerception-basedDataMiningandDecisionMakinginEconomicsandFinance,pages85–118.Springer,2007.(Citedonpages27,33,139,and140.)

[36]IldarZBatyrshinandLBSheremetov.Perception-basedapproachtotimeseriesdatamining.AppliedSoftComputing,8(3):1211–1221,2008.(Citedonpage139.)

[37]RiccardoBellazzi,FulviaFerrazzi,andLuciaSacchi.Predictivedatamin-inginclinicalmedicine:afocusonselectedmethodsandapplications.WileyInterdisciplinaryReviews:DataMiningandKnowledgeDiscov-ery,1(5):416–430,2011.(Citedonpages92,93,and94.)

[38]RiccardoBellazziandBlazZupan.Predictivedatamininginclinicalmedicine:Currentissuesandguidelines.InternationalJournalofMedi-calInformatics,77(2):81–97,2008.(Citedonpages94and97.)

[39]ChristosC.Bellos,AthanasiosPapadopoulos,RobertoRosso,andDim-itriosI.Fotiadis.Extractionandanalysisoffeaturesacquiredbywear-ablesensorsnetwork.In10thIEEEInternationalConferenceonInfor-mationTechnologyandApplicationsinBiomedicine,pages1–4,2010.(Citedonpages96and97.)

BIBLIOGRAPHY 169

[40] Christos C. Bellos, Athanasios Papadopoulos, Roberto Rosso, and Dim-itrios I. Fotiadis. Categorization of patients’ health status in copd diseaseusing a wearable platform and random forests methodology. In Proceed-ings of the IEEE International Conference on Biomedical and HealthInformatics, pages 404–407, 2012. (Cited on pages 95 and 98.)

[41] Christos C. Bellos, Athanasios Papadopoulos, Roberto Rosso, and Dim-itrios I. Fotiadis. A support vector machine approach for categorizationof patients suffering from chronic diseases. In Konstantina S. Nikita,James C. Lin, Dimitrios I. Fotiadis, and Maria-Teresa Arredondo Wald-meyer, editors, Wireless Mobile Communication and Healthcare, vol-ume 83, pages 264–267. Springer Berlin Heidelberg, 2012. (Cited onpages 95 and 101.)

[42] Yoshua Bengio, Aaron Courville, and Pascal Vincent. Representationlearning: A review and new perspectives. IEEE transactions on patternanalysis and machine intelligence, 35(8):1798–1828, 2013. (Cited onpage 38.)

[43] Anna Maria Bianchi, Martin Oswaldo Mendez, and Sergio Cerutti. Pro-cessing of signals recorded through smart devices: Sleep-quality assess-ment. IEEE Transactions on Information Technology in Biomedicine,14(3):741–747, 2010. (Cited on pages 95 and 100.)

[44] Fatih Emre Boran, Diyar Akay, and Ronald R. Yager. An overview ofmethods for linguistic summarization with fuzzy sets. Expert Systemswith Applications, 61:356 – 377, 2016. (Cited on page 28.)

[45] Sarah Boyd. Trend: a system for generating intelligent descriptions oftime series data. In IEEE International Conference on Intelligent Pro-cessing Systems (ICIPS1998). Citeseer, 1998. (Cited on page 111.)

[46] João Mario Lopes Brezolin, Sandro Rama Fiorini, Marciade Borba Campos, and Rafael H Bordini. Using conceptual spaces forobject recognition in multi-agent systems. In International Conferenceon Principles and Practice of Multi-Agent Systems, pages 697–705.Springer, 2015. (Cited on page 25.)

[47] Gavin Brown. A new perspective for information theoretic feature selec-tion. In International conference on artificial intelligence and statistics,pages 49–56, 2009. (Cited on page 39.)

[48] Gavin Brown, Adam Pocock, Ming-Jie Zhao, and Mikel Luján. Con-ditional likelihood maximisation: a unifying framework for informa-tion theoretic feature selection. Journal of Machine Learning Research,13(Jan):27–66, 2012. (Cited on page 40.)

BIBLIOGRAPHY169

[40]ChristosC.Bellos,AthanasiosPapadopoulos,RobertoRosso,andDim-itriosI.Fotiadis.Categorizationofpatients’healthstatusincopddiseaseusingawearableplatformandrandomforestsmethodology.InProceed-ingsoftheIEEEInternationalConferenceonBiomedicalandHealthInformatics,pages404–407,2012.(Citedonpages95and98.)

[41]ChristosC.Bellos,AthanasiosPapadopoulos,RobertoRosso,andDim-itriosI.Fotiadis.Asupportvectormachineapproachforcategorizationofpatientssufferingfromchronicdiseases.InKonstantinaS.Nikita,JamesC.Lin,DimitriosI.Fotiadis,andMaria-TeresaArredondoWald-meyer,editors,WirelessMobileCommunicationandHealthcare,vol-ume83,pages264–267.SpringerBerlinHeidelberg,2012.(Citedonpages95and101.)

[42]YoshuaBengio,AaronCourville,andPascalVincent.Representationlearning:Areviewandnewperspectives.IEEEtransactionsonpatternanalysisandmachineintelligence,35(8):1798–1828,2013.(Citedonpage38.)

[43]AnnaMariaBianchi,MartinOswaldoMendez,andSergioCerutti.Pro-cessingofsignalsrecordedthroughsmartdevices:Sleep-qualityassess-ment.IEEETransactionsonInformationTechnologyinBiomedicine,14(3):741–747,2010.(Citedonpages95and100.)

[44]FatihEmreBoran,DiyarAkay,andRonaldR.Yager.Anoverviewofmethodsforlinguisticsummarizationwithfuzzysets.ExpertSystemswithApplications,61:356–377,2016.(Citedonpage28.)

[45]SarahBoyd.Trend:asystemforgeneratingintelligentdescriptionsoftimeseriesdata.InIEEEInternationalConferenceonIntelligentPro-cessingSystems(ICIPS1998).Citeseer,1998.(Citedonpage111.)

[46]JoãoMarioLopesBrezolin,SandroRamaFiorini,MarciadeBorbaCampos,andRafaelHBordini.Usingconceptualspacesforobjectrecognitioninmulti-agentsystems.InInternationalConferenceonPrinciplesandPracticeofMulti-AgentSystems,pages697–705.Springer,2015.(Citedonpage25.)

[47]GavinBrown.Anewperspectiveforinformationtheoreticfeatureselec-tion.InInternationalconferenceonartificialintelligenceandstatistics,pages49–56,2009.(Citedonpage39.)

[48]GavinBrown,AdamPocock,Ming-JieZhao,andMikelLuján.Con-ditionallikelihoodmaximisation:aunifyingframeworkforinforma-tiontheoreticfeatureselection.JournalofMachineLearningResearch,13(Jan):27–66,2012.(Citedonpage40.)

BIBLIOGRAPHY 169

[40] Christos C. Bellos, Athanasios Papadopoulos, Roberto Rosso, and Dim-itrios I. Fotiadis. Categorization of patients’ health status in copd diseaseusing a wearable platform and random forests methodology. In Proceed-ings of the IEEE International Conference on Biomedical and HealthInformatics, pages 404–407, 2012. (Cited on pages 95 and 98.)

[41] Christos C. Bellos, Athanasios Papadopoulos, Roberto Rosso, and Dim-itrios I. Fotiadis. A support vector machine approach for categorizationof patients suffering from chronic diseases. In Konstantina S. Nikita,James C. Lin, Dimitrios I. Fotiadis, and Maria-Teresa Arredondo Wald-meyer, editors, Wireless Mobile Communication and Healthcare, vol-ume 83, pages 264–267. Springer Berlin Heidelberg, 2012. (Cited onpages 95 and 101.)

[42] Yoshua Bengio, Aaron Courville, and Pascal Vincent. Representationlearning: A review and new perspectives. IEEE transactions on patternanalysis and machine intelligence, 35(8):1798–1828, 2013. (Cited onpage 38.)

[43] Anna Maria Bianchi, Martin Oswaldo Mendez, and Sergio Cerutti. Pro-cessing of signals recorded through smart devices: Sleep-quality assess-ment. IEEE Transactions on Information Technology in Biomedicine,14(3):741–747, 2010. (Cited on pages 95 and 100.)

[44] Fatih Emre Boran, Diyar Akay, and Ronald R. Yager. An overview ofmethods for linguistic summarization with fuzzy sets. Expert Systemswith Applications, 61:356 – 377, 2016. (Cited on page 28.)

[45] Sarah Boyd. Trend: a system for generating intelligent descriptions oftime series data. In IEEE International Conference on Intelligent Pro-cessing Systems (ICIPS1998). Citeseer, 1998. (Cited on page 111.)

[46] João Mario Lopes Brezolin, Sandro Rama Fiorini, Marciade Borba Campos, and Rafael H Bordini. Using conceptual spaces forobject recognition in multi-agent systems. In International Conferenceon Principles and Practice of Multi-Agent Systems, pages 697–705.Springer, 2015. (Cited on page 25.)

[47] Gavin Brown. A new perspective for information theoretic feature selec-tion. In International conference on artificial intelligence and statistics,pages 49–56, 2009. (Cited on page 39.)

[48] Gavin Brown, Adam Pocock, Ming-Jie Zhao, and Mikel Luján. Con-ditional likelihood maximisation: a unifying framework for informa-tion theoretic feature selection. Journal of Machine Learning Research,13(Jan):27–66, 2012. (Cited on page 40.)

BIBLIOGRAPHY169

[40]ChristosC.Bellos,AthanasiosPapadopoulos,RobertoRosso,andDim-itriosI.Fotiadis.Categorizationofpatients’healthstatusincopddiseaseusingawearableplatformandrandomforestsmethodology.InProceed-ingsoftheIEEEInternationalConferenceonBiomedicalandHealthInformatics,pages404–407,2012.(Citedonpages95and98.)

[41]ChristosC.Bellos,AthanasiosPapadopoulos,RobertoRosso,andDim-itriosI.Fotiadis.Asupportvectormachineapproachforcategorizationofpatientssufferingfromchronicdiseases.InKonstantinaS.Nikita,JamesC.Lin,DimitriosI.Fotiadis,andMaria-TeresaArredondoWald-meyer,editors,WirelessMobileCommunicationandHealthcare,vol-ume83,pages264–267.SpringerBerlinHeidelberg,2012.(Citedonpages95and101.)

[42]YoshuaBengio,AaronCourville,andPascalVincent.Representationlearning:Areviewandnewperspectives.IEEEtransactionsonpatternanalysisandmachineintelligence,35(8):1798–1828,2013.(Citedonpage38.)

[43]AnnaMariaBianchi,MartinOswaldoMendez,andSergioCerutti.Pro-cessingofsignalsrecordedthroughsmartdevices:Sleep-qualityassess-ment.IEEETransactionsonInformationTechnologyinBiomedicine,14(3):741–747,2010.(Citedonpages95and100.)

[44]FatihEmreBoran,DiyarAkay,andRonaldR.Yager.Anoverviewofmethodsforlinguisticsummarizationwithfuzzysets.ExpertSystemswithApplications,61:356–377,2016.(Citedonpage28.)

[45]SarahBoyd.Trend:asystemforgeneratingintelligentdescriptionsoftimeseriesdata.InIEEEInternationalConferenceonIntelligentPro-cessingSystems(ICIPS1998).Citeseer,1998.(Citedonpage111.)

[46]JoãoMarioLopesBrezolin,SandroRamaFiorini,MarciadeBorbaCampos,andRafaelHBordini.Usingconceptualspacesforobjectrecognitioninmulti-agentsystems.InInternationalConferenceonPrinciplesandPracticeofMulti-AgentSystems,pages697–705.Springer,2015.(Citedonpage25.)

[47]GavinBrown.Anewperspectiveforinformationtheoreticfeatureselec-tion.InInternationalconferenceonartificialintelligenceandstatistics,pages49–56,2009.(Citedonpage39.)

[48]GavinBrown,AdamPocock,Ming-JieZhao,andMikelLuján.Con-ditionallikelihoodmaximisation:aunifyingframeworkforinforma-tiontheoreticfeatureselection.JournalofMachineLearningResearch,13(Jan):27–66,2012.(Citedonpage40.)

170 BIBLIOGRAPHY

[49] Rita M Catillo-Ortega, Nicolás Marín, and Daniel Sánchez. A fuzzyapproach to the linguistic summarization of time series. Journal ofMultiple-Valued Logic & Soft Computing, 17, 2011. (Cited on page28.)

[50] Gavin C Cawley and Nicola LC Talbot. Efficient leave-one-out cross-validation of kernel fisher discriminant classifiers. Pattern Recognition,36(11):2585–2592, 2003. (Cited on page 126.)

[51] Marie Chan, Daniel Estéve, Jean-Yves Fourniols, Christophe Escriba,and Eric Campo. Smart wearable systems: Current status and futurechallenges. Artificial Intelligence in Medicine, 56(3):137–156, Novem-ber 2012. (Cited on page 92.)

[52] Varun Chandola, Arindam Banerjee, and Vipin Kumar. Anomaly detec-tion: A survey. ACM Computing Surveys, 41(3):15:1–15:58, July 2009.(Cited on page 93.)

[53] Varun Chandola, Arindam Banerjee, and Vipin Kumar. Anomaly detec-tion for discrete sequences: A survey. IEEE Transactions on Knowledgeand Data Engineering, 24(5):823–839, May 2012. (Cited on page 93.)

[54] Sylvie Charbonnier and Sylviane Gentil. On-line adaptive trend extrac-tion of multiple physiological signals for alarm filtering in intensive careunits. International Journal of Adaptive Control and Signal Processing,pages 382–408, 2009. (Cited on pages 93, 94, and 96.)

[55] Samir Chatterjee, Kaushik Dutta, Harry (Qi) Xie, Jongbok Byun, AkshayPottathil, and Miles Moore. Persuasive and pervasive sensing: A newfrontier to monitor, track and assist older adults suffering from type-2diabetes. In Proceedings of the 2013 46th Hawaii International Con-ference on System Sciences, pages 2636–2645, Washington, DC, USA,2013. IEEE Computer Society. (Cited on pages 94, 98, 100, and 101.)

[56] Antonio Chella, Marcello Frixione, and Salvatore Gaglio. A cognitivearchitecture for artificial vision. Artificial Intelligence, 89(1-2):73–111,1997. (Cited on page 25.)

[57] Antonio Chella, Marcello Frixione, and Salvatore Gaglio. Anchoringsymbols to conceptual spaces: the case of dynamic scenarios. Roboticsand Autonomous Systems, 43(2):175–188, 2003. (Cited on page 25.)

[58] Hsinchun Chen, Sherrilynne S Fuller, Carol Friedman, and WilliamHersh. Medical informatics: knowledge management and data miningin biomedicine, volume 8. Springer Science & Business Media, 2006.(Cited on page 105.)

170BIBLIOGRAPHY

[49]RitaMCatillo-Ortega,NicolásMarín,andDanielSánchez.Afuzzyapproachtothelinguisticsummarizationoftimeseries.JournalofMultiple-ValuedLogic&SoftComputing,17,2011.(Citedonpage28.)

[50]GavinCCawleyandNicolaLCTalbot.Efficientleave-one-outcross-validationofkernelfisherdiscriminantclassifiers.PatternRecognition,36(11):2585–2592,2003.(Citedonpage126.)

[51]MarieChan,DanielEstéve,Jean-YvesFourniols,ChristopheEscriba,andEricCampo.Smartwearablesystems:Currentstatusandfuturechallenges.ArtificialIntelligenceinMedicine,56(3):137–156,Novem-ber2012.(Citedonpage92.)

[52]VarunChandola,ArindamBanerjee,andVipinKumar.Anomalydetec-tion:Asurvey.ACMComputingSurveys,41(3):15:1–15:58,July2009.(Citedonpage93.)

[53]VarunChandola,ArindamBanerjee,andVipinKumar.Anomalydetec-tionfordiscretesequences:Asurvey.IEEETransactionsonKnowledgeandDataEngineering,24(5):823–839,May2012.(Citedonpage93.)

[54]SylvieCharbonnierandSylvianeGentil.On-lineadaptivetrendextrac-tionofmultiplephysiologicalsignalsforalarmfilteringinintensivecareunits.InternationalJournalofAdaptiveControlandSignalProcessing,pages382–408,2009.(Citedonpages93,94,and96.)

[55]SamirChatterjee,KaushikDutta,Harry(Qi)Xie,JongbokByun,AkshayPottathil,andMilesMoore.Persuasiveandpervasivesensing:Anewfrontiertomonitor,trackandassistolderadultssufferingfromtype-2diabetes.InProceedingsofthe201346thHawaiiInternationalCon-ferenceonSystemSciences,pages2636–2645,Washington,DC,USA,2013.IEEEComputerSociety.(Citedonpages94,98,100,and101.)

[56]AntonioChella,MarcelloFrixione,andSalvatoreGaglio.Acognitivearchitectureforartificialvision.ArtificialIntelligence,89(1-2):73–111,1997.(Citedonpage25.)

[57]AntonioChella,MarcelloFrixione,andSalvatoreGaglio.Anchoringsymbolstoconceptualspaces:thecaseofdynamicscenarios.RoboticsandAutonomousSystems,43(2):175–188,2003.(Citedonpage25.)

[58]HsinchunChen,SherrilynneSFuller,CarolFriedman,andWilliamHersh.Medicalinformatics:knowledgemanagementanddatamininginbiomedicine,volume8.SpringerScience&BusinessMedia,2006.(Citedonpage105.)

170 BIBLIOGRAPHY

[49] Rita M Catillo-Ortega, Nicolás Marín, and Daniel Sánchez. A fuzzyapproach to the linguistic summarization of time series. Journal ofMultiple-Valued Logic & Soft Computing, 17, 2011. (Cited on page28.)

[50] Gavin C Cawley and Nicola LC Talbot. Efficient leave-one-out cross-validation of kernel fisher discriminant classifiers. Pattern Recognition,36(11):2585–2592, 2003. (Cited on page 126.)

[51] Marie Chan, Daniel Estéve, Jean-Yves Fourniols, Christophe Escriba,and Eric Campo. Smart wearable systems: Current status and futurechallenges. Artificial Intelligence in Medicine, 56(3):137–156, Novem-ber 2012. (Cited on page 92.)

[52] Varun Chandola, Arindam Banerjee, and Vipin Kumar. Anomaly detec-tion: A survey. ACM Computing Surveys, 41(3):15:1–15:58, July 2009.(Cited on page 93.)

[53] Varun Chandola, Arindam Banerjee, and Vipin Kumar. Anomaly detec-tion for discrete sequences: A survey. IEEE Transactions on Knowledgeand Data Engineering, 24(5):823–839, May 2012. (Cited on page 93.)

[54] Sylvie Charbonnier and Sylviane Gentil. On-line adaptive trend extrac-tion of multiple physiological signals for alarm filtering in intensive careunits. International Journal of Adaptive Control and Signal Processing,pages 382–408, 2009. (Cited on pages 93, 94, and 96.)

[55] Samir Chatterjee, Kaushik Dutta, Harry (Qi) Xie, Jongbok Byun, AkshayPottathil, and Miles Moore. Persuasive and pervasive sensing: A newfrontier to monitor, track and assist older adults suffering from type-2diabetes. In Proceedings of the 2013 46th Hawaii International Con-ference on System Sciences, pages 2636–2645, Washington, DC, USA,2013. IEEE Computer Society. (Cited on pages 94, 98, 100, and 101.)

[56] Antonio Chella, Marcello Frixione, and Salvatore Gaglio. A cognitivearchitecture for artificial vision. Artificial Intelligence, 89(1-2):73–111,1997. (Cited on page 25.)

[57] Antonio Chella, Marcello Frixione, and Salvatore Gaglio. Anchoringsymbols to conceptual spaces: the case of dynamic scenarios. Roboticsand Autonomous Systems, 43(2):175–188, 2003. (Cited on page 25.)

[58] Hsinchun Chen, Sherrilynne S Fuller, Carol Friedman, and WilliamHersh. Medical informatics: knowledge management and data miningin biomedicine, volume 8. Springer Science & Business Media, 2006.(Cited on page 105.)

170BIBLIOGRAPHY

[49]RitaMCatillo-Ortega,NicolásMarín,andDanielSánchez.Afuzzyapproachtothelinguisticsummarizationoftimeseries.JournalofMultiple-ValuedLogic&SoftComputing,17,2011.(Citedonpage28.)

[50]GavinCCawleyandNicolaLCTalbot.Efficientleave-one-outcross-validationofkernelfisherdiscriminantclassifiers.PatternRecognition,36(11):2585–2592,2003.(Citedonpage126.)

[51]MarieChan,DanielEstéve,Jean-YvesFourniols,ChristopheEscriba,andEricCampo.Smartwearablesystems:Currentstatusandfuturechallenges.ArtificialIntelligenceinMedicine,56(3):137–156,Novem-ber2012.(Citedonpage92.)

[52]VarunChandola,ArindamBanerjee,andVipinKumar.Anomalydetec-tion:Asurvey.ACMComputingSurveys,41(3):15:1–15:58,July2009.(Citedonpage93.)

[53]VarunChandola,ArindamBanerjee,andVipinKumar.Anomalydetec-tionfordiscretesequences:Asurvey.IEEETransactionsonKnowledgeandDataEngineering,24(5):823–839,May2012.(Citedonpage93.)

[54]SylvieCharbonnierandSylvianeGentil.On-lineadaptivetrendextrac-tionofmultiplephysiologicalsignalsforalarmfilteringinintensivecareunits.InternationalJournalofAdaptiveControlandSignalProcessing,pages382–408,2009.(Citedonpages93,94,and96.)

[55]SamirChatterjee,KaushikDutta,Harry(Qi)Xie,JongbokByun,AkshayPottathil,andMilesMoore.Persuasiveandpervasivesensing:Anewfrontiertomonitor,trackandassistolderadultssufferingfromtype-2diabetes.InProceedingsofthe201346thHawaiiInternationalCon-ferenceonSystemSciences,pages2636–2645,Washington,DC,USA,2013.IEEEComputerSociety.(Citedonpages94,98,100,and101.)

[56]AntonioChella,MarcelloFrixione,andSalvatoreGaglio.Acognitivearchitectureforartificialvision.ArtificialIntelligence,89(1-2):73–111,1997.(Citedonpage25.)

[57]AntonioChella,MarcelloFrixione,andSalvatoreGaglio.Anchoringsymbolstoconceptualspaces:thecaseofdynamicscenarios.RoboticsandAutonomousSystems,43(2):175–188,2003.(Citedonpage25.)

[58]HsinchunChen,SherrilynneSFuller,CarolFriedman,andWilliamHersh.Medicalinformatics:knowledgemanagementanddatamininginbiomedicine,volume8.SpringerScience&BusinessMedia,2006.(Citedonpage105.)

BIBLIOGRAPHY 171

[59] Yueguo Chen, Mario A Nascimento, Beng Chin Ooi, and Anthony KHTung. Spade: On shape-based pattern detection in streaming time series.In IEEE 23rd International Conference on Data Engineering (ICDE)2007, pages 786–795. IEEE, 2007. (Cited on page 124.)

[60] Jongyoon Choi, Beena Ahmed, and Ricardo Gutierrez-Osuna. Develop-ment and evaluation of an ambulatory stress monitor based on wearablesensors. IEEE transactions on information technology in biomedicine,16(2):279–286, 2012. (Cited on pages 94, 96, and 97.)

[61] Lei Clifton, David A Clifton, Marco AF Pimentel, Peter J Watkinson, andLionel Tarassenko. Gaussian processes for personalized e-health moni-toring with wearable sensors. IEEE Transactions on Biomedical Engi-neering, 60(1):193–197, 2013. (Cited on pages 93, 98, 99, and 100.)

[62] Carlo Combi and Alberto Sabaini. Extraction, analysis, and visualiza-tion of temporal association rules from interval-based clinical data. InConference on Artificial Intelligence in Medicine in Europe, pages 238–247. Springer, 2013. (Cited on page 121.)

[63] Corinna Cortes and Vladimir Vapnik. Support-vector networks. Ma-chine learning, 20(3):273–297, 1995. (Cited on page 98.)

[64] Paulo Cortez, António Cerdeira, Fernando Almeida, Telmo Matos, andJosé Reis. Modeling wine preferences by data mining from physicochem-ical properties. Decision Support Systems, 47(4):547–553, 2009. (Citedon page 70.)

[65] Thomas M Cover and Joy A Thomas. Elements of information theory2nd edition. 2006. (Cited on page 40.)

[66] Richard Cubek, Wolfgang Ertel, and Günther Palm. High-level learningfrom demonstration with conceptual spaces and subspace clustering. InIEEE International Conference on Robotics and Automation (ICRA)2015, pages 2592–2597. IEEE, 2015. (Cited on page 25.)

[67] Gautam Das, King-Ip Lin, Heikki Mannila, Gopal Renganathan, andPadhraic Smyth. Rule discovery from time series. In Proceedings ofthe Fourth International Conference on Knowledge Discovery and DataMining KDD’98, pages 16–22, 1998. (Cited on page 121.)

[68] Donald Davidson. Truth and meaning. synthese, 17(1):304–323, 1967.(Cited on page 20.)

[69] Lieven Decock and Igor Douven. What is graded membership? Noûs,48(4):653–682, 2014. (Cited on page 62.)

BIBLIOGRAPHY171

[59]YueguoChen,MarioANascimento,BengChinOoi,andAnthonyKHTung.Spade:Onshape-basedpatterndetectioninstreamingtimeseries.InIEEE23rdInternationalConferenceonDataEngineering(ICDE)2007,pages786–795.IEEE,2007.(Citedonpage124.)

[60]JongyoonChoi,BeenaAhmed,andRicardoGutierrez-Osuna.Develop-mentandevaluationofanambulatorystressmonitorbasedonwearablesensors.IEEEtransactionsoninformationtechnologyinbiomedicine,16(2):279–286,2012.(Citedonpages94,96,and97.)

[61]LeiClifton,DavidAClifton,MarcoAFPimentel,PeterJWatkinson,andLionelTarassenko.Gaussianprocessesforpersonalizede-healthmoni-toringwithwearablesensors.IEEETransactionsonBiomedicalEngi-neering,60(1):193–197,2013.(Citedonpages93,98,99,and100.)

[62]CarloCombiandAlbertoSabaini.Extraction,analysis,andvisualiza-tionoftemporalassociationrulesfrominterval-basedclinicaldata.InConferenceonArtificialIntelligenceinMedicineinEurope,pages238–247.Springer,2013.(Citedonpage121.)

[63]CorinnaCortesandVladimirVapnik.Support-vectornetworks.Ma-chinelearning,20(3):273–297,1995.(Citedonpage98.)

[64]PauloCortez,AntónioCerdeira,FernandoAlmeida,TelmoMatos,andJoséReis.Modelingwinepreferencesbydataminingfromphysicochem-icalproperties.DecisionSupportSystems,47(4):547–553,2009.(Citedonpage70.)

[65]ThomasMCoverandJoyAThomas.Elementsofinformationtheory2ndedition.2006.(Citedonpage40.)

[66]RichardCubek,WolfgangErtel,andGüntherPalm.High-levellearningfromdemonstrationwithconceptualspacesandsubspaceclustering.InIEEEInternationalConferenceonRoboticsandAutomation(ICRA)2015,pages2592–2597.IEEE,2015.(Citedonpage25.)

[67]GautamDas,King-IpLin,HeikkiMannila,GopalRenganathan,andPadhraicSmyth.Rulediscoveryfromtimeseries.InProceedingsoftheFourthInternationalConferenceonKnowledgeDiscoveryandDataMiningKDD’98,pages16–22,1998.(Citedonpage121.)

[68]DonaldDavidson.Truthandmeaning.synthese,17(1):304–323,1967.(Citedonpage20.)

[69]LievenDecockandIgorDouven.Whatisgradedmembership?Noûs,48(4):653–682,2014.(Citedonpage62.)

BIBLIOGRAPHY 171

[59] Yueguo Chen, Mario A Nascimento, Beng Chin Ooi, and Anthony KHTung. Spade: On shape-based pattern detection in streaming time series.In IEEE 23rd International Conference on Data Engineering (ICDE)2007, pages 786–795. IEEE, 2007. (Cited on page 124.)

[60] Jongyoon Choi, Beena Ahmed, and Ricardo Gutierrez-Osuna. Develop-ment and evaluation of an ambulatory stress monitor based on wearablesensors. IEEE transactions on information technology in biomedicine,16(2):279–286, 2012. (Cited on pages 94, 96, and 97.)

[61] Lei Clifton, David A Clifton, Marco AF Pimentel, Peter J Watkinson, andLionel Tarassenko. Gaussian processes for personalized e-health moni-toring with wearable sensors. IEEE Transactions on Biomedical Engi-neering, 60(1):193–197, 2013. (Cited on pages 93, 98, 99, and 100.)

[62] Carlo Combi and Alberto Sabaini. Extraction, analysis, and visualiza-tion of temporal association rules from interval-based clinical data. InConference on Artificial Intelligence in Medicine in Europe, pages 238–247. Springer, 2013. (Cited on page 121.)

[63] Corinna Cortes and Vladimir Vapnik. Support-vector networks. Ma-chine learning, 20(3):273–297, 1995. (Cited on page 98.)

[64] Paulo Cortez, António Cerdeira, Fernando Almeida, Telmo Matos, andJosé Reis. Modeling wine preferences by data mining from physicochem-ical properties. Decision Support Systems, 47(4):547–553, 2009. (Citedon page 70.)

[65] Thomas M Cover and Joy A Thomas. Elements of information theory2nd edition. 2006. (Cited on page 40.)

[66] Richard Cubek, Wolfgang Ertel, and Günther Palm. High-level learningfrom demonstration with conceptual spaces and subspace clustering. InIEEE International Conference on Robotics and Automation (ICRA)2015, pages 2592–2597. IEEE, 2015. (Cited on page 25.)

[67] Gautam Das, King-Ip Lin, Heikki Mannila, Gopal Renganathan, andPadhraic Smyth. Rule discovery from time series. In Proceedings ofthe Fourth International Conference on Knowledge Discovery and DataMining KDD’98, pages 16–22, 1998. (Cited on page 121.)

[68] Donald Davidson. Truth and meaning. synthese, 17(1):304–323, 1967.(Cited on page 20.)

[69] Lieven Decock and Igor Douven. What is graded membership? Noûs,48(4):653–682, 2014. (Cited on page 62.)

BIBLIOGRAPHY171

[59]YueguoChen,MarioANascimento,BengChinOoi,andAnthonyKHTung.Spade:Onshape-basedpatterndetectioninstreamingtimeseries.InIEEE23rdInternationalConferenceonDataEngineering(ICDE)2007,pages786–795.IEEE,2007.(Citedonpage124.)

[60]JongyoonChoi,BeenaAhmed,andRicardoGutierrez-Osuna.Develop-mentandevaluationofanambulatorystressmonitorbasedonwearablesensors.IEEEtransactionsoninformationtechnologyinbiomedicine,16(2):279–286,2012.(Citedonpages94,96,and97.)

[61]LeiClifton,DavidAClifton,MarcoAFPimentel,PeterJWatkinson,andLionelTarassenko.Gaussianprocessesforpersonalizede-healthmoni-toringwithwearablesensors.IEEETransactionsonBiomedicalEngi-neering,60(1):193–197,2013.(Citedonpages93,98,99,and100.)

[62]CarloCombiandAlbertoSabaini.Extraction,analysis,andvisualiza-tionoftemporalassociationrulesfrominterval-basedclinicaldata.InConferenceonArtificialIntelligenceinMedicineinEurope,pages238–247.Springer,2013.(Citedonpage121.)

[63]CorinnaCortesandVladimirVapnik.Support-vectornetworks.Ma-chinelearning,20(3):273–297,1995.(Citedonpage98.)

[64]PauloCortez,AntónioCerdeira,FernandoAlmeida,TelmoMatos,andJoséReis.Modelingwinepreferencesbydataminingfromphysicochem-icalproperties.DecisionSupportSystems,47(4):547–553,2009.(Citedonpage70.)

[65]ThomasMCoverandJoyAThomas.Elementsofinformationtheory2ndedition.2006.(Citedonpage40.)

[66]RichardCubek,WolfgangErtel,andGüntherPalm.High-levellearningfromdemonstrationwithconceptualspacesandsubspaceclustering.InIEEEInternationalConferenceonRoboticsandAutomation(ICRA)2015,pages2592–2597.IEEE,2015.(Citedonpage25.)

[67]GautamDas,King-IpLin,HeikkiMannila,GopalRenganathan,andPadhraicSmyth.Rulediscoveryfromtimeseries.InProceedingsoftheFourthInternationalConferenceonKnowledgeDiscoveryandDataMiningKDD’98,pages16–22,1998.(Citedonpage121.)

[68]DonaldDavidson.Truthandmeaning.synthese,17(1):304–323,1967.(Citedonpage20.)

[69]LievenDecockandIgorDouven.Whatisgradedmembership?Noûs,48(4):653–682,2014.(Citedonpage62.)

172 BIBLIOGRAPHY

[70] Miguel Delgado, M Dolores Ruiz, Daniel Sánchez, and M Amparo Vila.Fuzzy quantification: a state of the art. Fuzzy Sets and Systems, 242:1–30, 2014. (Cited on pages 27 and 28.)

[71] Anne M Denton, Christopher A Besemann, and Dietmar H Dorr.Pattern-based time-series subsequence clustering using radial distribu-tion functions. Knowledge and Information Systems, 18(1):1–27, 2009.(Cited on page 114.)

[72] Joaquín Derrac and Steven Schockaert. Inducing semantic relations fromconceptual spaces: a data-driven approach to plausible reasoning. Arti-ficial Intelligence, 228:66–94, 2015. (Cited on page 33.)

[73] Félix Díaz-Hermida, Martin Pereira-Fariña, Juan Carlos Vidal, and Ale-jandro Ramos-Soto. Characterizing quantifier fuzzification mechanisms:A behavioral guide for applications. Fuzzy Sets and Systems, 2017.(Cited on page 27.)

[74] Hao Ding, Hong Sun, and Kun mean Hou. Abnormal ecg signal detec-tion based on compressed sampling in wearable ecg sensor. In Interna-tional Conference on Wireless Communications and Signal Processing,pages 1–5, 2011. (Cited on pages 98, 99, and 101.)

[75] Włodzisław Duch. Filter methods. In Feature Extraction, pages 89–117.Springer, 2006. (Cited on page 38.)

[76] Damian Dudek. Measures for comparing association rule sets. In Inter-national Conference on Artificial Intelligence and Soft Computing, pages315–322. Springer, 2010. (Cited on page 124.)

[77] Alireza Farhadi, Ian Endres, Derek Hoiem, and David Forsyth. Describ-ing objects by their attributes. In Computer Vision and Pattern Recog-nition, pages 1778–1785, June 2009. (Cited on page 159.)

[78] Yansong Feng and Mirella Lapata. Visual information in semantic rep-resentation. In Human Language Technologies: The 2010 Annual Con-ference of the North American Chapter of the Association for Computa-tional Linguistics, pages 91–99. Association for Computational Linguis-tics, 2010. (Cited on page 19.)

[79] Leo Ferres, Avi Parush, Shelley Roberts, and Gitte Lindgaard. Help-ing people with visual impairments gain access to graphical informa-tion through natural language: The igraph system. In InternationalConference on Computers for Handicapped Persons, pages 1122–1130.Springer, 2006. (Cited on page 31.)

172BIBLIOGRAPHY

[70]MiguelDelgado,MDoloresRuiz,DanielSánchez,andMAmparoVila.Fuzzyquantification:astateoftheart.FuzzySetsandSystems,242:1–30,2014.(Citedonpages27and28.)

[71]AnneMDenton,ChristopherABesemann,andDietmarHDorr.Pattern-basedtime-seriessubsequenceclusteringusingradialdistribu-tionfunctions.KnowledgeandInformationSystems,18(1):1–27,2009.(Citedonpage114.)

[72]JoaquínDerracandStevenSchockaert.Inducingsemanticrelationsfromconceptualspaces:adata-drivenapproachtoplausiblereasoning.Arti-ficialIntelligence,228:66–94,2015.(Citedonpage33.)

[73]FélixDíaz-Hermida,MartinPereira-Fariña,JuanCarlosVidal,andAle-jandroRamos-Soto.Characterizingquantifierfuzzificationmechanisms:Abehavioralguideforapplications.FuzzySetsandSystems,2017.(Citedonpage27.)

[74]HaoDing,HongSun,andKunmeanHou.Abnormalecgsignaldetec-tionbasedoncompressedsamplinginwearableecgsensor.InInterna-tionalConferenceonWirelessCommunicationsandSignalProcessing,pages1–5,2011.(Citedonpages98,99,and101.)

[75]WłodzisławDuch.Filtermethods.InFeatureExtraction,pages89–117.Springer,2006.(Citedonpage38.)

[76]DamianDudek.Measuresforcomparingassociationrulesets.InInter-nationalConferenceonArtificialIntelligenceandSoftComputing,pages315–322.Springer,2010.(Citedonpage124.)

[77]AlirezaFarhadi,IanEndres,DerekHoiem,andDavidForsyth.Describ-ingobjectsbytheirattributes.InComputerVisionandPatternRecog-nition,pages1778–1785,June2009.(Citedonpage159.)

[78]YansongFengandMirellaLapata.Visualinformationinsemanticrep-resentation.InHumanLanguageTechnologies:The2010AnnualCon-ferenceoftheNorthAmericanChapteroftheAssociationforComputa-tionalLinguistics,pages91–99.AssociationforComputationalLinguis-tics,2010.(Citedonpage19.)

[79]LeoFerres,AviParush,ShelleyRoberts,andGitteLindgaard.Help-ingpeoplewithvisualimpairmentsgainaccesstographicalinforma-tionthroughnaturallanguage:Theigraphsystem.InInternationalConferenceonComputersforHandicappedPersons,pages1122–1130.Springer,2006.(Citedonpage31.)

172 BIBLIOGRAPHY

[70] Miguel Delgado, M Dolores Ruiz, Daniel Sánchez, and M Amparo Vila.Fuzzy quantification: a state of the art. Fuzzy Sets and Systems, 242:1–30, 2014. (Cited on pages 27 and 28.)

[71] Anne M Denton, Christopher A Besemann, and Dietmar H Dorr.Pattern-based time-series subsequence clustering using radial distribu-tion functions. Knowledge and Information Systems, 18(1):1–27, 2009.(Cited on page 114.)

[72] Joaquín Derrac and Steven Schockaert. Inducing semantic relations fromconceptual spaces: a data-driven approach to plausible reasoning. Arti-ficial Intelligence, 228:66–94, 2015. (Cited on page 33.)

[73] Félix Díaz-Hermida, Martin Pereira-Fariña, Juan Carlos Vidal, and Ale-jandro Ramos-Soto. Characterizing quantifier fuzzification mechanisms:A behavioral guide for applications. Fuzzy Sets and Systems, 2017.(Cited on page 27.)

[74] Hao Ding, Hong Sun, and Kun mean Hou. Abnormal ecg signal detec-tion based on compressed sampling in wearable ecg sensor. In Interna-tional Conference on Wireless Communications and Signal Processing,pages 1–5, 2011. (Cited on pages 98, 99, and 101.)

[75] Włodzisław Duch. Filter methods. In Feature Extraction, pages 89–117.Springer, 2006. (Cited on page 38.)

[76] Damian Dudek. Measures for comparing association rule sets. In Inter-national Conference on Artificial Intelligence and Soft Computing, pages315–322. Springer, 2010. (Cited on page 124.)

[77] Alireza Farhadi, Ian Endres, Derek Hoiem, and David Forsyth. Describ-ing objects by their attributes. In Computer Vision and Pattern Recog-nition, pages 1778–1785, June 2009. (Cited on page 159.)

[78] Yansong Feng and Mirella Lapata. Visual information in semantic rep-resentation. In Human Language Technologies: The 2010 Annual Con-ference of the North American Chapter of the Association for Computa-tional Linguistics, pages 91–99. Association for Computational Linguis-tics, 2010. (Cited on page 19.)

[79] Leo Ferres, Avi Parush, Shelley Roberts, and Gitte Lindgaard. Help-ing people with visual impairments gain access to graphical informa-tion through natural language: The igraph system. In InternationalConference on Computers for Handicapped Persons, pages 1122–1130.Springer, 2006. (Cited on page 31.)

172BIBLIOGRAPHY

[70]MiguelDelgado,MDoloresRuiz,DanielSánchez,andMAmparoVila.Fuzzyquantification:astateoftheart.FuzzySetsandSystems,242:1–30,2014.(Citedonpages27and28.)

[71]AnneMDenton,ChristopherABesemann,andDietmarHDorr.Pattern-basedtime-seriessubsequenceclusteringusingradialdistribu-tionfunctions.KnowledgeandInformationSystems,18(1):1–27,2009.(Citedonpage114.)

[72]JoaquínDerracandStevenSchockaert.Inducingsemanticrelationsfromconceptualspaces:adata-drivenapproachtoplausiblereasoning.Arti-ficialIntelligence,228:66–94,2015.(Citedonpage33.)

[73]FélixDíaz-Hermida,MartinPereira-Fariña,JuanCarlosVidal,andAle-jandroRamos-Soto.Characterizingquantifierfuzzificationmechanisms:Abehavioralguideforapplications.FuzzySetsandSystems,2017.(Citedonpage27.)

[74]HaoDing,HongSun,andKunmeanHou.Abnormalecgsignaldetec-tionbasedoncompressedsamplinginwearableecgsensor.InInterna-tionalConferenceonWirelessCommunicationsandSignalProcessing,pages1–5,2011.(Citedonpages98,99,and101.)

[75]WłodzisławDuch.Filtermethods.InFeatureExtraction,pages89–117.Springer,2006.(Citedonpage38.)

[76]DamianDudek.Measuresforcomparingassociationrulesets.InInter-nationalConferenceonArtificialIntelligenceandSoftComputing,pages315–322.Springer,2010.(Citedonpage124.)

[77]AlirezaFarhadi,IanEndres,DerekHoiem,andDavidForsyth.Describ-ingobjectsbytheirattributes.InComputerVisionandPatternRecog-nition,pages1778–1785,June2009.(Citedonpage159.)

[78]YansongFengandMirellaLapata.Visualinformationinsemanticrep-resentation.InHumanLanguageTechnologies:The2010AnnualCon-ferenceoftheNorthAmericanChapteroftheAssociationforComputa-tionalLinguistics,pages91–99.AssociationforComputationalLinguis-tics,2010.(Citedonpage19.)

[79]LeoFerres,AviParush,ShelleyRoberts,andGitteLindgaard.Help-ingpeoplewithvisualimpairmentsgainaccesstographicalinforma-tionthroughnaturallanguage:Theigraphsystem.InInternationalConferenceonComputersforHandicappedPersons,pages1122–1130.Springer,2006.(Citedonpage31.)

BIBLIOGRAPHY 173

[80] François Fleuret. Fast binary feature selection with conditional mu-tual information. Journal of Machine Learning Research, 5(Nov):1531–1555, 2004. (Cited on page 40.)

[81] Christos A Frantzidis, Charalampos Bratsas, Manousos A Klados, Ev-dokimos Konstantinidis, Chrysa D Lithari, Ana B Vivas, Christos L Pa-padelis, Eleni Kaldoudi, Costas Pappas, and Panagiotis D Bamidis. Onthe classification of emotional biosignals evoked while viewing affectivepictures: an integrated data-mining-based approach for healthcare appli-cations. IEEE Transactions on Information Technology in Biomedicine,14(2):309–318, 2010. (Cited on pages 95, 96, 97, and 98.)

[82] Tak-chung Fu. A review on time series data mining. Engineering Appli-cations of Artificial Intelligence, 24(1):164–181, 2011. (Cited on pages110 and 113.)

[83] Ben D Fulcher and Nick S Jones. Highly comparative feature-based time-series classification. IEEE Transactions on Knowledge and Data Engi-neering, 26(12):3026–3037, 2014. (Cited on page 141.)

[84] Peter Gärdenfors. Three levels of inductive inference. Studies in Logicand the Foundations of Mathematics, 134:427–449, 1995. (Cited onpage 3.)

[85] Peter Gärdenfors. Symbolic, conceptual and subconceptual representa-tions. In Human and Machine Perception, pages 255–270. Springer,1997. (Cited on page 3.)

[86] Peter Gärdenfors. Conceptual spaces: The geometry of thought. MITpress, 2000. (Cited on pages 3, 4, 20, 21, 22, 23, 24, 25, 42, 44, 45,47, 49, 50, 51, 66, 82, and 148.)

[87] Peter Gardenfors. Conceptual spaces as a framework for knowledgerepresentation. Mind and Matter, 2(2):9–27, 2004. (Cited on pages 3,21, 23, 24, 25, 35, and 38.)

[88] Peter Gärdenfors. Induction, conceptual spaces and ai. The Dynamics ofThought, pages 109–124, 2005. (Cited on pages 25, 51, 53, and 161.)

[89] Peter Gärdenfors. Semantics based on conceptual spaces. In IndianConference on Logic and Its Applications, pages 1–11. Springer, 2011.(Cited on pages 3, 23, and 25.)

[90] Peter Gärdenfors. The geometry of meaning: Semantics based on con-ceptual spaces. MIT Press, 2014. (Cited on pages 20 and 47.)

[91] Peter Gärdenfors and Massimo Warglien. Using conceptual spaces tomodel actions and events. Journal of Semantics, page ffs007, 2012.(Cited on page 45.)

BIBLIOGRAPHY173

[80]FrançoisFleuret.Fastbinaryfeatureselectionwithconditionalmu-tualinformation.JournalofMachineLearningResearch,5(Nov):1531–1555,2004.(Citedonpage40.)

[81]ChristosAFrantzidis,CharalamposBratsas,ManousosAKlados,Ev-dokimosKonstantinidis,ChrysaDLithari,AnaBVivas,ChristosLPa-padelis,EleniKaldoudi,CostasPappas,andPanagiotisDBamidis.Ontheclassificationofemotionalbiosignalsevokedwhileviewingaffectivepictures:anintegrateddata-mining-basedapproachforhealthcareappli-cations.IEEETransactionsonInformationTechnologyinBiomedicine,14(2):309–318,2010.(Citedonpages95,96,97,and98.)

[82]Tak-chungFu.Areviewontimeseriesdatamining.EngineeringAppli-cationsofArtificialIntelligence,24(1):164–181,2011.(Citedonpages110and113.)

[83]BenDFulcherandNickSJones.Highlycomparativefeature-basedtime-seriesclassification.IEEETransactionsonKnowledgeandDataEngi-neering,26(12):3026–3037,2014.(Citedonpage141.)

[84]PeterGärdenfors.Threelevelsofinductiveinference.StudiesinLogicandtheFoundationsofMathematics,134:427–449,1995.(Citedonpage3.)

[85]PeterGärdenfors.Symbolic,conceptualandsubconceptualrepresenta-tions.InHumanandMachinePerception,pages255–270.Springer,1997.(Citedonpage3.)

[86]PeterGärdenfors.Conceptualspaces:Thegeometryofthought.MITpress,2000.(Citedonpages3,4,20,21,22,23,24,25,42,44,45,47,49,50,51,66,82,and148.)

[87]PeterGardenfors.Conceptualspacesasaframeworkforknowledgerepresentation.MindandMatter,2(2):9–27,2004.(Citedonpages3,21,23,24,25,35,and38.)

[88]PeterGärdenfors.Induction,conceptualspacesandai.TheDynamicsofThought,pages109–124,2005.(Citedonpages25,51,53,and161.)

[89]PeterGärdenfors.Semanticsbasedonconceptualspaces.InIndianConferenceonLogicandItsApplications,pages1–11.Springer,2011.(Citedonpages3,23,and25.)

[90]PeterGärdenfors.Thegeometryofmeaning:Semanticsbasedoncon-ceptualspaces.MITPress,2014.(Citedonpages20and47.)

[91]PeterGärdenforsandMassimoWarglien.Usingconceptualspacestomodelactionsandevents.JournalofSemantics,pageffs007,2012.(Citedonpage45.)

BIBLIOGRAPHY 173

[80] François Fleuret. Fast binary feature selection with conditional mu-tual information. Journal of Machine Learning Research, 5(Nov):1531–1555, 2004. (Cited on page 40.)

[81] Christos A Frantzidis, Charalampos Bratsas, Manousos A Klados, Ev-dokimos Konstantinidis, Chrysa D Lithari, Ana B Vivas, Christos L Pa-padelis, Eleni Kaldoudi, Costas Pappas, and Panagiotis D Bamidis. Onthe classification of emotional biosignals evoked while viewing affectivepictures: an integrated data-mining-based approach for healthcare appli-cations. IEEE Transactions on Information Technology in Biomedicine,14(2):309–318, 2010. (Cited on pages 95, 96, 97, and 98.)

[82] Tak-chung Fu. A review on time series data mining. Engineering Appli-cations of Artificial Intelligence, 24(1):164–181, 2011. (Cited on pages110 and 113.)

[83] Ben D Fulcher and Nick S Jones. Highly comparative feature-based time-series classification. IEEE Transactions on Knowledge and Data Engi-neering, 26(12):3026–3037, 2014. (Cited on page 141.)

[84] Peter Gärdenfors. Three levels of inductive inference. Studies in Logicand the Foundations of Mathematics, 134:427–449, 1995. (Cited onpage 3.)

[85] Peter Gärdenfors. Symbolic, conceptual and subconceptual representa-tions. In Human and Machine Perception, pages 255–270. Springer,1997. (Cited on page 3.)

[86] Peter Gärdenfors. Conceptual spaces: The geometry of thought. MITpress, 2000. (Cited on pages 3, 4, 20, 21, 22, 23, 24, 25, 42, 44, 45,47, 49, 50, 51, 66, 82, and 148.)

[87] Peter Gardenfors. Conceptual spaces as a framework for knowledgerepresentation. Mind and Matter, 2(2):9–27, 2004. (Cited on pages 3,21, 23, 24, 25, 35, and 38.)

[88] Peter Gärdenfors. Induction, conceptual spaces and ai. The Dynamics ofThought, pages 109–124, 2005. (Cited on pages 25, 51, 53, and 161.)

[89] Peter Gärdenfors. Semantics based on conceptual spaces. In IndianConference on Logic and Its Applications, pages 1–11. Springer, 2011.(Cited on pages 3, 23, and 25.)

[90] Peter Gärdenfors. The geometry of meaning: Semantics based on con-ceptual spaces. MIT Press, 2014. (Cited on pages 20 and 47.)

[91] Peter Gärdenfors and Massimo Warglien. Using conceptual spaces tomodel actions and events. Journal of Semantics, page ffs007, 2012.(Cited on page 45.)

BIBLIOGRAPHY173

[80]FrançoisFleuret.Fastbinaryfeatureselectionwithconditionalmu-tualinformation.JournalofMachineLearningResearch,5(Nov):1531–1555,2004.(Citedonpage40.)

[81]ChristosAFrantzidis,CharalamposBratsas,ManousosAKlados,Ev-dokimosKonstantinidis,ChrysaDLithari,AnaBVivas,ChristosLPa-padelis,EleniKaldoudi,CostasPappas,andPanagiotisDBamidis.Ontheclassificationofemotionalbiosignalsevokedwhileviewingaffectivepictures:anintegrateddata-mining-basedapproachforhealthcareappli-cations.IEEETransactionsonInformationTechnologyinBiomedicine,14(2):309–318,2010.(Citedonpages95,96,97,and98.)

[82]Tak-chungFu.Areviewontimeseriesdatamining.EngineeringAppli-cationsofArtificialIntelligence,24(1):164–181,2011.(Citedonpages110and113.)

[83]BenDFulcherandNickSJones.Highlycomparativefeature-basedtime-seriesclassification.IEEETransactionsonKnowledgeandDataEngi-neering,26(12):3026–3037,2014.(Citedonpage141.)

[84]PeterGärdenfors.Threelevelsofinductiveinference.StudiesinLogicandtheFoundationsofMathematics,134:427–449,1995.(Citedonpage3.)

[85]PeterGärdenfors.Symbolic,conceptualandsubconceptualrepresenta-tions.InHumanandMachinePerception,pages255–270.Springer,1997.(Citedonpage3.)

[86]PeterGärdenfors.Conceptualspaces:Thegeometryofthought.MITpress,2000.(Citedonpages3,4,20,21,22,23,24,25,42,44,45,47,49,50,51,66,82,and148.)

[87]PeterGardenfors.Conceptualspacesasaframeworkforknowledgerepresentation.MindandMatter,2(2):9–27,2004.(Citedonpages3,21,23,24,25,35,and38.)

[88]PeterGärdenfors.Induction,conceptualspacesandai.TheDynamicsofThought,pages109–124,2005.(Citedonpages25,51,53,and161.)

[89]PeterGärdenfors.Semanticsbasedonconceptualspaces.InIndianConferenceonLogicandItsApplications,pages1–11.Springer,2011.(Citedonpages3,23,and25.)

[90]PeterGärdenfors.Thegeometryofmeaning:Semanticsbasedoncon-ceptualspaces.MITPress,2014.(Citedonpages20and47.)

[91]PeterGärdenforsandMassimoWarglien.Usingconceptualspacestomodelactionsandevents.JournalofSemantics,pageffs007,2012.(Citedonpage45.)

174 BIBLIOGRAPHY

[92] Peter Gärdenfors and Mary-Anne Williams. Reasoning about categoriesin conceptual spaces. In Proceedings of the 17th international joint con-ference on Artificial, IJCAI’01, pages 385–392, 2001. (Cited on pages25, 50, and 62.)

[93] Albert Gatt and Emiel Krahmer. Survey of the state of the art in naturallanguage generation: Core tasks, applications and evaluation. Journal ofArtificial Intelligence Research, 61:65–170, 2018. (Cited on pages 30,31, and 32.)

[94] Albert Gatt, Francois Portet, Ehud Reiter, Jim Hunter, Saad Mahamood,Wendy Moncur, and Somayajulu Sripada. From data to text in theneonatal intensive care unit: Using nlg technology for decision supportand information management. Ai Communications, 22(3):153–186,2009. (Cited on pages 31 and 32.)

[95] Albert Gatt and Ehud Reiter. Simplenlg: A realisation engine for prac-tical applications. In Proceedings of the 12th European Workshop onNatural Language Generation, pages 90–93. Association for Computa-tional Linguistics, 2009. (Cited on page 70.)

[96] Albert Gatt, Roger PG van Gompel, Emiel Krahmer, and Kees vanDeemter. Does domain size impact speech onset time during referenceproduction? In 34th Annual Meeting of the Cognitive Science Society,pages 1584–1589, 2012. (Cited on page 159.)

[97] Elena Gaura, John Kemp, and James Brusey. Leveraging knowledge fromphysiological data: On-body heat stress risk prediction with sensor net-works. IEEE transactions on biomedical circuits and systems, 7(6):861–870, 2013. (Cited on pages 93, 94, 98, and 99.)

[98] Ronald L Gellish, Brian R Goslin, Ronald E Olson, AUDRY McDON-ALD, Gary D Russi, and Virinder K Moudgil. Longitudinal modelingof the relationship between age and maximal heart rate. Medicine andscience in sports and exercise, 39(5):822–829, 2007. (Cited on page95.)

[99] James E Gentle, Wolfgang Karl Härdle, and Yuichi Mori. Handbookof computational statistics: concepts and methods. Springer Science &Business Media, 2012. (Cited on page 109.)

[100] John Gialelis, Petros Chondros, Dimitrios Karadimas, Sofia Dima, andDimitrios Serpanos. Identifying chronic disease complications utilizingstate of the art data fusion methodologies and signal processing algo-rithms. In Konstantina S. Nikita, James C. Lin, Dimitrios I. Fotiadis,

174BIBLIOGRAPHY

[92]PeterGärdenforsandMary-AnneWilliams.Reasoningaboutcategoriesinconceptualspaces.InProceedingsofthe17thinternationaljointcon-ferenceonArtificial,IJCAI’01,pages385–392,2001.(Citedonpages25,50,and62.)

[93]AlbertGattandEmielKrahmer.Surveyofthestateoftheartinnaturallanguagegeneration:Coretasks,applicationsandevaluation.JournalofArtificialIntelligenceResearch,61:65–170,2018.(Citedonpages30,31,and32.)

[94]AlbertGatt,FrancoisPortet,EhudReiter,JimHunter,SaadMahamood,WendyMoncur,andSomayajuluSripada.Fromdatatotextintheneonatalintensivecareunit:Usingnlgtechnologyfordecisionsupportandinformationmanagement.AiCommunications,22(3):153–186,2009.(Citedonpages31and32.)

[95]AlbertGattandEhudReiter.Simplenlg:Arealisationengineforprac-ticalapplications.InProceedingsofthe12thEuropeanWorkshoponNaturalLanguageGeneration,pages90–93.AssociationforComputa-tionalLinguistics,2009.(Citedonpage70.)

[96]AlbertGatt,RogerPGvanGompel,EmielKrahmer,andKeesvanDeemter.Doesdomainsizeimpactspeechonsettimeduringreferenceproduction?In34thAnnualMeetingoftheCognitiveScienceSociety,pages1584–1589,2012.(Citedonpage159.)

[97]ElenaGaura,JohnKemp,andJamesBrusey.Leveragingknowledgefromphysiologicaldata:On-bodyheatstressriskpredictionwithsensornet-works.IEEEtransactionsonbiomedicalcircuitsandsystems,7(6):861–870,2013.(Citedonpages93,94,98,and99.)

[98]RonaldLGellish,BrianRGoslin,RonaldEOlson,AUDRYMcDON-ALD,GaryDRussi,andVirinderKMoudgil.Longitudinalmodelingoftherelationshipbetweenageandmaximalheartrate.Medicineandscienceinsportsandexercise,39(5):822–829,2007.(Citedonpage95.)

[99]JamesEGentle,WolfgangKarlHärdle,andYuichiMori.Handbookofcomputationalstatistics:conceptsandmethods.SpringerScience&BusinessMedia,2012.(Citedonpage109.)

[100]JohnGialelis,PetrosChondros,DimitriosKaradimas,SofiaDima,andDimitriosSerpanos.Identifyingchronicdiseasecomplicationsutilizingstateoftheartdatafusionmethodologiesandsignalprocessingalgo-rithms.InKonstantinaS.Nikita,JamesC.Lin,DimitriosI.Fotiadis,

174 BIBLIOGRAPHY

[92] Peter Gärdenfors and Mary-Anne Williams. Reasoning about categoriesin conceptual spaces. In Proceedings of the 17th international joint con-ference on Artificial, IJCAI’01, pages 385–392, 2001. (Cited on pages25, 50, and 62.)

[93] Albert Gatt and Emiel Krahmer. Survey of the state of the art in naturallanguage generation: Core tasks, applications and evaluation. Journal ofArtificial Intelligence Research, 61:65–170, 2018. (Cited on pages 30,31, and 32.)

[94] Albert Gatt, Francois Portet, Ehud Reiter, Jim Hunter, Saad Mahamood,Wendy Moncur, and Somayajulu Sripada. From data to text in theneonatal intensive care unit: Using nlg technology for decision supportand information management. Ai Communications, 22(3):153–186,2009. (Cited on pages 31 and 32.)

[95] Albert Gatt and Ehud Reiter. Simplenlg: A realisation engine for prac-tical applications. In Proceedings of the 12th European Workshop onNatural Language Generation, pages 90–93. Association for Computa-tional Linguistics, 2009. (Cited on page 70.)

[96] Albert Gatt, Roger PG van Gompel, Emiel Krahmer, and Kees vanDeemter. Does domain size impact speech onset time during referenceproduction? In 34th Annual Meeting of the Cognitive Science Society,pages 1584–1589, 2012. (Cited on page 159.)

[97] Elena Gaura, John Kemp, and James Brusey. Leveraging knowledge fromphysiological data: On-body heat stress risk prediction with sensor net-works. IEEE transactions on biomedical circuits and systems, 7(6):861–870, 2013. (Cited on pages 93, 94, 98, and 99.)

[98] Ronald L Gellish, Brian R Goslin, Ronald E Olson, AUDRY McDON-ALD, Gary D Russi, and Virinder K Moudgil. Longitudinal modelingof the relationship between age and maximal heart rate. Medicine andscience in sports and exercise, 39(5):822–829, 2007. (Cited on page95.)

[99] James E Gentle, Wolfgang Karl Härdle, and Yuichi Mori. Handbookof computational statistics: concepts and methods. Springer Science &Business Media, 2012. (Cited on page 109.)

[100] John Gialelis, Petros Chondros, Dimitrios Karadimas, Sofia Dima, andDimitrios Serpanos. Identifying chronic disease complications utilizingstate of the art data fusion methodologies and signal processing algo-rithms. In Konstantina S. Nikita, James C. Lin, Dimitrios I. Fotiadis,

174BIBLIOGRAPHY

[92]PeterGärdenforsandMary-AnneWilliams.Reasoningaboutcategoriesinconceptualspaces.InProceedingsofthe17thinternationaljointcon-ferenceonArtificial,IJCAI’01,pages385–392,2001.(Citedonpages25,50,and62.)

[93]AlbertGattandEmielKrahmer.Surveyofthestateoftheartinnaturallanguagegeneration:Coretasks,applicationsandevaluation.JournalofArtificialIntelligenceResearch,61:65–170,2018.(Citedonpages30,31,and32.)

[94]AlbertGatt,FrancoisPortet,EhudReiter,JimHunter,SaadMahamood,WendyMoncur,andSomayajuluSripada.Fromdatatotextintheneonatalintensivecareunit:Usingnlgtechnologyfordecisionsupportandinformationmanagement.AiCommunications,22(3):153–186,2009.(Citedonpages31and32.)

[95]AlbertGattandEhudReiter.Simplenlg:Arealisationengineforprac-ticalapplications.InProceedingsofthe12thEuropeanWorkshoponNaturalLanguageGeneration,pages90–93.AssociationforComputa-tionalLinguistics,2009.(Citedonpage70.)

[96]AlbertGatt,RogerPGvanGompel,EmielKrahmer,andKeesvanDeemter.Doesdomainsizeimpactspeechonsettimeduringreferenceproduction?In34thAnnualMeetingoftheCognitiveScienceSociety,pages1584–1589,2012.(Citedonpage159.)

[97]ElenaGaura,JohnKemp,andJamesBrusey.Leveragingknowledgefromphysiologicaldata:On-bodyheatstressriskpredictionwithsensornet-works.IEEEtransactionsonbiomedicalcircuitsandsystems,7(6):861–870,2013.(Citedonpages93,94,98,and99.)

[98]RonaldLGellish,BrianRGoslin,RonaldEOlson,AUDRYMcDON-ALD,GaryDRussi,andVirinderKMoudgil.Longitudinalmodelingoftherelationshipbetweenageandmaximalheartrate.Medicineandscienceinsportsandexercise,39(5):822–829,2007.(Citedonpage95.)

[99]JamesEGentle,WolfgangKarlHärdle,andYuichiMori.Handbookofcomputationalstatistics:conceptsandmethods.SpringerScience&BusinessMedia,2012.(Citedonpage109.)

[100]JohnGialelis,PetrosChondros,DimitriosKaradimas,SofiaDima,andDimitriosSerpanos.Identifyingchronicdiseasecomplicationsutilizingstateoftheartdatafusionmethodologiesandsignalprocessingalgo-rithms.InKonstantinaS.Nikita,JamesC.Lin,DimitriosI.Fotiadis,

BIBLIOGRAPHY 175

and Maria-Teresa Arredondo Waldmeyer, editors, Wireless Mobile Com-munication and Healthcare, volume 83, pages 256–263. Springer BerlinHeidelberg, 2012. (Cited on pages 93, 95, 96, and 98.)

[101] Donna Giri, U Rajendra Acharya, Roshan Joy Martis, S Vinitha Sree,Teik-Cheng Lim, Thajudin Ahamed VI, and Jasjit S Suri. Automateddiagnosis of coronary artery disease affected patients using lda, pca, icaand discrete wavelet transform. Knowledge-Based Systems, 37:274–282,2013. (Cited on pages 95, 96, 97, 98, and 99.)

[102] Dimitra Gkatzia. Data-driven approaches to content selection for data-to-text generation. PhD thesis, Heriot-Watt University, 2015. (Cited onpage 32.)

[103] Dimitra Gkatzia. Content selection in data-to-text systems: A survey.arXiv preprint arXiv:1610.08375, 2016. (Cited on page 29.)

[104] Eli Goldberg, Norbert Driedger, and Richard I Kittredge. Using natural-language processing to produce weather forecasts. IEEE Expert,9(2):45–53, 1994. (Cited on page 31.)

[105] Ary L Goldberger, Luis AN Amaral, Leon Glass, Jeffrey M Hausdorff,Plamen Ch Ivanov, Roger G Mark, Joseph E Mietus, George B Moody,Chung-Kang Peng, and H Eugene Stanley. Physiobank, physiotoolkit,and physionet. Circulation, 101(23):e215–e220, 2000. (Cited on pages99 and 107.)

[106] Dina Goldin, Ricardo Mardales, and George Nagy. In search of mean-ing for time series subsequence clustering: matching algorithms basedon a new distance measure. In Proceedings of the 15th ACM interna-tional conference on Information and knowledge management, pages347–356. ACM, 2006. (Cited on page 116.)

[107] Machon Gregory and Ben Shneiderman. Shape identification in temporaldata sets. In Expanding the Frontiers of Visual Analytics and Visualiza-tion, pages 305–321. Springer, 2012. (Cited on pages 140 and 141.)

[108] Thomas L Griffiths, Mark Steyvers, and Joshua B Tenenbaum. Top-ics in semantic representation. Psychological review, 114(2):211, 2007.(Cited on pages 19, 20, and 33.)

[109] Isabelle Guyon and André Elisseeff. An introduction to variable andfeature selection. Journal of machine learning research, 3(Mar):1157–1182, 2003. (Cited on page 38.)

[110] Isabelle Guyon, Steve Gunn, Masoud Nikravesh, and Lofti A Zadeh.Feature extraction: foundations and applications, volume 207. Springer,2008. (Cited on page 38.)

BIBLIOGRAPHY175

andMaria-TeresaArredondoWaldmeyer,editors,WirelessMobileCom-municationandHealthcare,volume83,pages256–263.SpringerBerlinHeidelberg,2012.(Citedonpages93,95,96,and98.)

[101]DonnaGiri,URajendraAcharya,RoshanJoyMartis,SVinithaSree,Teik-ChengLim,ThajudinAhamedVI,andJasjitSSuri.Automateddiagnosisofcoronaryarterydiseaseaffectedpatientsusinglda,pca,icaanddiscretewavelettransform.Knowledge-BasedSystems,37:274–282,2013.(Citedonpages95,96,97,98,and99.)

[102]DimitraGkatzia.Data-drivenapproachestocontentselectionfordata-to-textgeneration.PhDthesis,Heriot-WattUniversity,2015.(Citedonpage32.)

[103]DimitraGkatzia.Contentselectionindata-to-textsystems:Asurvey.arXivpreprintarXiv:1610.08375,2016.(Citedonpage29.)

[104]EliGoldberg,NorbertDriedger,andRichardIKittredge.Usingnatural-languageprocessingtoproduceweatherforecasts.IEEEExpert,9(2):45–53,1994.(Citedonpage31.)

[105]AryLGoldberger,LuisANAmaral,LeonGlass,JeffreyMHausdorff,PlamenChIvanov,RogerGMark,JosephEMietus,GeorgeBMoody,Chung-KangPeng,andHEugeneStanley.Physiobank,physiotoolkit,andphysionet.Circulation,101(23):e215–e220,2000.(Citedonpages99and107.)

[106]DinaGoldin,RicardoMardales,andGeorgeNagy.Insearchofmean-ingfortimeseriessubsequenceclustering:matchingalgorithmsbasedonanewdistancemeasure.InProceedingsofthe15thACMinterna-tionalconferenceonInformationandknowledgemanagement,pages347–356.ACM,2006.(Citedonpage116.)

[107]MachonGregoryandBenShneiderman.Shapeidentificationintemporaldatasets.InExpandingtheFrontiersofVisualAnalyticsandVisualiza-tion,pages305–321.Springer,2012.(Citedonpages140and141.)

[108]ThomasLGriffiths,MarkSteyvers,andJoshuaBTenenbaum.Top-icsinsemanticrepresentation.Psychologicalreview,114(2):211,2007.(Citedonpages19,20,and33.)

[109]IsabelleGuyonandAndréElisseeff.Anintroductiontovariableandfeatureselection.Journalofmachinelearningresearch,3(Mar):1157–1182,2003.(Citedonpage38.)

[110]IsabelleGuyon,SteveGunn,MasoudNikravesh,andLoftiAZadeh.Featureextraction:foundationsandapplications,volume207.Springer,2008.(Citedonpage38.)

BIBLIOGRAPHY 175

and Maria-Teresa Arredondo Waldmeyer, editors, Wireless Mobile Com-munication and Healthcare, volume 83, pages 256–263. Springer BerlinHeidelberg, 2012. (Cited on pages 93, 95, 96, and 98.)

[101] Donna Giri, U Rajendra Acharya, Roshan Joy Martis, S Vinitha Sree,Teik-Cheng Lim, Thajudin Ahamed VI, and Jasjit S Suri. Automateddiagnosis of coronary artery disease affected patients using lda, pca, icaand discrete wavelet transform. Knowledge-Based Systems, 37:274–282,2013. (Cited on pages 95, 96, 97, 98, and 99.)

[102] Dimitra Gkatzia. Data-driven approaches to content selection for data-to-text generation. PhD thesis, Heriot-Watt University, 2015. (Cited onpage 32.)

[103] Dimitra Gkatzia. Content selection in data-to-text systems: A survey.arXiv preprint arXiv:1610.08375, 2016. (Cited on page 29.)

[104] Eli Goldberg, Norbert Driedger, and Richard I Kittredge. Using natural-language processing to produce weather forecasts. IEEE Expert,9(2):45–53, 1994. (Cited on page 31.)

[105] Ary L Goldberger, Luis AN Amaral, Leon Glass, Jeffrey M Hausdorff,Plamen Ch Ivanov, Roger G Mark, Joseph E Mietus, George B Moody,Chung-Kang Peng, and H Eugene Stanley. Physiobank, physiotoolkit,and physionet. Circulation, 101(23):e215–e220, 2000. (Cited on pages99 and 107.)

[106] Dina Goldin, Ricardo Mardales, and George Nagy. In search of mean-ing for time series subsequence clustering: matching algorithms basedon a new distance measure. In Proceedings of the 15th ACM interna-tional conference on Information and knowledge management, pages347–356. ACM, 2006. (Cited on page 116.)

[107] Machon Gregory and Ben Shneiderman. Shape identification in temporaldata sets. In Expanding the Frontiers of Visual Analytics and Visualiza-tion, pages 305–321. Springer, 2012. (Cited on pages 140 and 141.)

[108] Thomas L Griffiths, Mark Steyvers, and Joshua B Tenenbaum. Top-ics in semantic representation. Psychological review, 114(2):211, 2007.(Cited on pages 19, 20, and 33.)

[109] Isabelle Guyon and André Elisseeff. An introduction to variable andfeature selection. Journal of machine learning research, 3(Mar):1157–1182, 2003. (Cited on page 38.)

[110] Isabelle Guyon, Steve Gunn, Masoud Nikravesh, and Lofti A Zadeh.Feature extraction: foundations and applications, volume 207. Springer,2008. (Cited on page 38.)

BIBLIOGRAPHY175

andMaria-TeresaArredondoWaldmeyer,editors,WirelessMobileCom-municationandHealthcare,volume83,pages256–263.SpringerBerlinHeidelberg,2012.(Citedonpages93,95,96,and98.)

[101]DonnaGiri,URajendraAcharya,RoshanJoyMartis,SVinithaSree,Teik-ChengLim,ThajudinAhamedVI,andJasjitSSuri.Automateddiagnosisofcoronaryarterydiseaseaffectedpatientsusinglda,pca,icaanddiscretewavelettransform.Knowledge-BasedSystems,37:274–282,2013.(Citedonpages95,96,97,98,and99.)

[102]DimitraGkatzia.Data-drivenapproachestocontentselectionfordata-to-textgeneration.PhDthesis,Heriot-WattUniversity,2015.(Citedonpage32.)

[103]DimitraGkatzia.Contentselectionindata-to-textsystems:Asurvey.arXivpreprintarXiv:1610.08375,2016.(Citedonpage29.)

[104]EliGoldberg,NorbertDriedger,andRichardIKittredge.Usingnatural-languageprocessingtoproduceweatherforecasts.IEEEExpert,9(2):45–53,1994.(Citedonpage31.)

[105]AryLGoldberger,LuisANAmaral,LeonGlass,JeffreyMHausdorff,PlamenChIvanov,RogerGMark,JosephEMietus,GeorgeBMoody,Chung-KangPeng,andHEugeneStanley.Physiobank,physiotoolkit,andphysionet.Circulation,101(23):e215–e220,2000.(Citedonpages99and107.)

[106]DinaGoldin,RicardoMardales,andGeorgeNagy.Insearchofmean-ingfortimeseriessubsequenceclustering:matchingalgorithmsbasedonanewdistancemeasure.InProceedingsofthe15thACMinterna-tionalconferenceonInformationandknowledgemanagement,pages347–356.ACM,2006.(Citedonpage116.)

[107]MachonGregoryandBenShneiderman.Shapeidentificationintemporaldatasets.InExpandingtheFrontiersofVisualAnalyticsandVisualiza-tion,pages305–321.Springer,2012.(Citedonpages140and141.)

[108]ThomasLGriffiths,MarkSteyvers,andJoshuaBTenenbaum.Top-icsinsemanticrepresentation.Psychologicalreview,114(2):211,2007.(Citedonpages19,20,and33.)

[109]IsabelleGuyonandAndréElisseeff.Anintroductiontovariableandfeatureselection.Journalofmachinelearningresearch,3(Mar):1157–1182,2003.(Citedonpage38.)

[110]IsabelleGuyon,SteveGunn,MasoudNikravesh,andLoftiAZadeh.Featureextraction:foundationsandapplications,volume207.Springer,2008.(Citedonpage38.)

176 BIBLIOGRAPHY

[111] James Hampton and Helen Moss. Concepts and meaning: Introductionto the special issue on conceptual representation. Language and cogni-tive processes, 18(5-6):505–512, 2003. (Cited on pages 21 and 33.)

[112] James A Hampton. Typicality, graded membership, and vagueness. Cog-nitive Science, 31(3):355–384, 2007. (Cited on page 62.)

[113] Jing He, Yanchun Zhang, Guangyan Huang, Yefei Xin, Xiaohui Liu,Hao Lan Zhang, Stanley Chiang, and Hailun Zhang. An association ruleanalysis framework for complex physiological and genetic data. In In-ternational Conference on Health Information Science, pages 131–142.Springer, 2012. (Cited on page 121.)

[114] Jing He, Yanchun Zhang, Guangyan Huang, Yefei Xin, Xiaohui Liu,HaoLan Zhang, Stanley Chiang, and Hailun Zhang. An associationrule analysis framework for complex physiological and genetic data. InJing He, Xiaohui Liu, ElizabethA. Krupinski, and Guandong Xu, editors,Health Information Science, volume 7231 of Lecture Notes in ComputerScience, pages 131–142. Springer Berlin Heidelberg, 2012. (Cited onpage 98.)

[115] Åke Hjalmarson. Heart rate: an independent risk factor in cardiovas-cular disease. European Heart Journal Supplements, 9(suppl_F):F3–F7,2007. (Cited on page 95.)

[116] Michael Holender, Rakesh Nagi, Moises Sudit, and J Terry Rickard. In-formation fusion using conceptual spaces: Mathematical programmingmodels and methods. In Information Fusion, 2007 10th InternationalConference on, pages 1–8. IEEE, 2007. (Cited on pages 3, 21, 23,and 24.)

[117] Fei Hu, Meng Jiang, Laura Celentano, and Yang Xiao. Robust medicalad hoc sensor networks (masn) with wavelet-based ecg data mining. AdHoc Networks, 6(7):986–1012, 2008. (Cited on pages 96, 97, and 98.)

[118] Guangyan Huang, Yanchun Zhang, Jie Cao, Michael Steyn, and KersiTaraporewalla. Online mining abnormal period patterns from multi-ple medical sensor data streams. World Wide Web, pages 1–19, 2013.(Cited on pages 93, 99, 100, 101, and 102.)

[119] James Hunter, Yvonne Freer, Albert Gatt, Ehud Reiter, Somayajulu Sri-pada, and Cindy Sykes. Automatic generation of natural language nurs-ing shift summaries in neonatal intensive care: Bt-nurse. Artificial intel-ligence in medicine, 56(3):157–172, 2012. (Cited on pages 30, 31, 103,and 132.)

176BIBLIOGRAPHY

[111]JamesHamptonandHelenMoss.Conceptsandmeaning:Introductiontothespecialissueonconceptualrepresentation.Languageandcogni-tiveprocesses,18(5-6):505–512,2003.(Citedonpages21and33.)

[112]JamesAHampton.Typicality,gradedmembership,andvagueness.Cog-nitiveScience,31(3):355–384,2007.(Citedonpage62.)

[113]JingHe,YanchunZhang,GuangyanHuang,YefeiXin,XiaohuiLiu,HaoLanZhang,StanleyChiang,andHailunZhang.Anassociationruleanalysisframeworkforcomplexphysiologicalandgeneticdata.InIn-ternationalConferenceonHealthInformationScience,pages131–142.Springer,2012.(Citedonpage121.)

[114]JingHe,YanchunZhang,GuangyanHuang,YefeiXin,XiaohuiLiu,HaoLanZhang,StanleyChiang,andHailunZhang.Anassociationruleanalysisframeworkforcomplexphysiologicalandgeneticdata.InJingHe,XiaohuiLiu,ElizabethA.Krupinski,andGuandongXu,editors,HealthInformationScience,volume7231ofLectureNotesinComputerScience,pages131–142.SpringerBerlinHeidelberg,2012.(Citedonpage98.)

[115]ÅkeHjalmarson.Heartrate:anindependentriskfactorincardiovas-culardisease.EuropeanHeartJournalSupplements,9(suppl_F):F3–F7,2007.(Citedonpage95.)

[116]MichaelHolender,RakeshNagi,MoisesSudit,andJTerryRickard.In-formationfusionusingconceptualspaces:Mathematicalprogrammingmodelsandmethods.InInformationFusion,200710thInternationalConferenceon,pages1–8.IEEE,2007.(Citedonpages3,21,23,and24.)

[117]FeiHu,MengJiang,LauraCelentano,andYangXiao.Robustmedicaladhocsensornetworks(masn)withwavelet-basedecgdatamining.AdHocNetworks,6(7):986–1012,2008.(Citedonpages96,97,and98.)

[118]GuangyanHuang,YanchunZhang,JieCao,MichaelSteyn,andKersiTaraporewalla.Onlineminingabnormalperiodpatternsfrommulti-plemedicalsensordatastreams.WorldWideWeb,pages1–19,2013.(Citedonpages93,99,100,101,and102.)

[119]JamesHunter,YvonneFreer,AlbertGatt,EhudReiter,SomayajuluSri-pada,andCindySykes.Automaticgenerationofnaturallanguagenurs-ingshiftsummariesinneonatalintensivecare:Bt-nurse.Artificialintel-ligenceinmedicine,56(3):157–172,2012.(Citedonpages30,31,103,and132.)

176 BIBLIOGRAPHY

[111] James Hampton and Helen Moss. Concepts and meaning: Introductionto the special issue on conceptual representation. Language and cogni-tive processes, 18(5-6):505–512, 2003. (Cited on pages 21 and 33.)

[112] James A Hampton. Typicality, graded membership, and vagueness. Cog-nitive Science, 31(3):355–384, 2007. (Cited on page 62.)

[113] Jing He, Yanchun Zhang, Guangyan Huang, Yefei Xin, Xiaohui Liu,Hao Lan Zhang, Stanley Chiang, and Hailun Zhang. An association ruleanalysis framework for complex physiological and genetic data. In In-ternational Conference on Health Information Science, pages 131–142.Springer, 2012. (Cited on page 121.)

[114] Jing He, Yanchun Zhang, Guangyan Huang, Yefei Xin, Xiaohui Liu,HaoLan Zhang, Stanley Chiang, and Hailun Zhang. An associationrule analysis framework for complex physiological and genetic data. InJing He, Xiaohui Liu, ElizabethA. Krupinski, and Guandong Xu, editors,Health Information Science, volume 7231 of Lecture Notes in ComputerScience, pages 131–142. Springer Berlin Heidelberg, 2012. (Cited onpage 98.)

[115] Åke Hjalmarson. Heart rate: an independent risk factor in cardiovas-cular disease. European Heart Journal Supplements, 9(suppl_F):F3–F7,2007. (Cited on page 95.)

[116] Michael Holender, Rakesh Nagi, Moises Sudit, and J Terry Rickard. In-formation fusion using conceptual spaces: Mathematical programmingmodels and methods. In Information Fusion, 2007 10th InternationalConference on, pages 1–8. IEEE, 2007. (Cited on pages 3, 21, 23,and 24.)

[117] Fei Hu, Meng Jiang, Laura Celentano, and Yang Xiao. Robust medicalad hoc sensor networks (masn) with wavelet-based ecg data mining. AdHoc Networks, 6(7):986–1012, 2008. (Cited on pages 96, 97, and 98.)

[118] Guangyan Huang, Yanchun Zhang, Jie Cao, Michael Steyn, and KersiTaraporewalla. Online mining abnormal period patterns from multi-ple medical sensor data streams. World Wide Web, pages 1–19, 2013.(Cited on pages 93, 99, 100, 101, and 102.)

[119] James Hunter, Yvonne Freer, Albert Gatt, Ehud Reiter, Somayajulu Sri-pada, and Cindy Sykes. Automatic generation of natural language nurs-ing shift summaries in neonatal intensive care: Bt-nurse. Artificial intel-ligence in medicine, 56(3):157–172, 2012. (Cited on pages 30, 31, 103,and 132.)

176BIBLIOGRAPHY

[111]JamesHamptonandHelenMoss.Conceptsandmeaning:Introductiontothespecialissueonconceptualrepresentation.Languageandcogni-tiveprocesses,18(5-6):505–512,2003.(Citedonpages21and33.)

[112]JamesAHampton.Typicality,gradedmembership,andvagueness.Cog-nitiveScience,31(3):355–384,2007.(Citedonpage62.)

[113]JingHe,YanchunZhang,GuangyanHuang,YefeiXin,XiaohuiLiu,HaoLanZhang,StanleyChiang,andHailunZhang.Anassociationruleanalysisframeworkforcomplexphysiologicalandgeneticdata.InIn-ternationalConferenceonHealthInformationScience,pages131–142.Springer,2012.(Citedonpage121.)

[114]JingHe,YanchunZhang,GuangyanHuang,YefeiXin,XiaohuiLiu,HaoLanZhang,StanleyChiang,andHailunZhang.Anassociationruleanalysisframeworkforcomplexphysiologicalandgeneticdata.InJingHe,XiaohuiLiu,ElizabethA.Krupinski,andGuandongXu,editors,HealthInformationScience,volume7231ofLectureNotesinComputerScience,pages131–142.SpringerBerlinHeidelberg,2012.(Citedonpage98.)

[115]ÅkeHjalmarson.Heartrate:anindependentriskfactorincardiovas-culardisease.EuropeanHeartJournalSupplements,9(suppl_F):F3–F7,2007.(Citedonpage95.)

[116]MichaelHolender,RakeshNagi,MoisesSudit,andJTerryRickard.In-formationfusionusingconceptualspaces:Mathematicalprogrammingmodelsandmethods.InInformationFusion,200710thInternationalConferenceon,pages1–8.IEEE,2007.(Citedonpages3,21,23,and24.)

[117]FeiHu,MengJiang,LauraCelentano,andYangXiao.Robustmedicaladhocsensornetworks(masn)withwavelet-basedecgdatamining.AdHocNetworks,6(7):986–1012,2008.(Citedonpages96,97,and98.)

[118]GuangyanHuang,YanchunZhang,JieCao,MichaelSteyn,andKersiTaraporewalla.Onlineminingabnormalperiodpatternsfrommulti-plemedicalsensordatastreams.WorldWideWeb,pages1–19,2013.(Citedonpages93,99,100,101,and102.)

[119]JamesHunter,YvonneFreer,AlbertGatt,EhudReiter,SomayajuluSri-pada,andCindySykes.Automaticgenerationofnaturallanguagenurs-ingshiftsummariesinneonatalintensivecare:Bt-nurse.Artificialintel-ligenceinmedicine,56(3):157–172,2012.(Citedonpages30,31,103,and132.)

BIBLIOGRAPHY 177

[120] Lidija Iordanskaja, Myunghee Kim, Richard Kittredge, Benoit Lavoie,and Alain Polguere. Generation of extended bilingual statistical reports.In Proceedings of the 14th conference on Computational linguistics-Volume 3, pages 1019–1023. Association for Computational Linguistics,1992. (Cited on page 31.)

[121] Anil K Jain. Data clustering: 50 years beyond k-means. Pattern recogni-tion letters, 31(8):651–666, 2010. (Cited on page 116.)

[122] Tony Jebara. Machine learning: discriminative and generative, volume755. Springer Science & Business Media, 2012. (Cited on page 83.)

[123] Zhanpeng Jin, Yuwen Sun, and A. C. Cheng. Predicting cardiovascu-lar disease from real-time electrocardiographic monitoring: An adaptivemachine learning approach on a cell phone. In Proceedings of the An-nual International Conference of the IEEE Engineering in Medicine andBiology Society, pages 6889–6892, 2009. (Cited on page 100.)

[124] Janusz Kacprzyk, Anna Wilbik, and S Zadrozny. Linguistic summariza-tion of time series using a fuzzy quantifier driven aggregation. Fuzzy Setsand Systems, 159(12):1485–1499, 2008. (Cited on pages 111 and 139.)

[125] Janusz Kacprzyk and Sławomir Zadrozny. Linguistic database sum-maries and their protoforms: towards natural language based knowledgediscovery tools. Information Sciences, 173(4):281–304, 2005. (Citedon page 28.)

[126] Janusz Kacprzyk and Sławomir Zadrozny. Computing with words isan implementable paradigm: fuzzy queries, linguistic data summaries,and natural-language generation. IEEE Transactions on Fuzzy Systems,18(3):461–472, 2010. (Cited on pages 27 and 28.)

[127] Jayant Kalagnanam and Max Henrion. A comparison of decision analy-sis and expert rules for sequential diagnosis. In Machine Intelligence andPattern Recognition, volume 9, pages 271–281. Elsevier, 1990. (Citedon page 98.)

[128] Walter Karlen, Claudio Mattiussi, and Dario Floreano. Sleep and wakeclassification with ecg and respiratory effort signals. IEEE Transactionson Biomedical Circuits and Systems, 3(2):71–78, 2009. (Cited on pages95 and 98.)

[129] Eamonn Keogh. Instance-based learning. In Encyclopedia of MachineLearning, pages 549–550. Springer, 2011. (Cited on page 25.)

[130] Eamonn Keogh, Selina Chu, David Hart, and Michael Pazzani. Segment-ing time series: A survey and novel approach. Data mining in time seriesdatabases, 57:1–22, 2004. (Cited on page 110.)

BIBLIOGRAPHY177

[120]LidijaIordanskaja,MyungheeKim,RichardKittredge,BenoitLavoie,andAlainPolguere.Generationofextendedbilingualstatisticalreports.InProceedingsofthe14thconferenceonComputationallinguistics-Volume3,pages1019–1023.AssociationforComputationalLinguistics,1992.(Citedonpage31.)

[121]AnilKJain.Dataclustering:50yearsbeyondk-means.Patternrecogni-tionletters,31(8):651–666,2010.(Citedonpage116.)

[122]TonyJebara.Machinelearning:discriminativeandgenerative,volume755.SpringerScience&BusinessMedia,2012.(Citedonpage83.)

[123]ZhanpengJin,YuwenSun,andA.C.Cheng.Predictingcardiovascu-lardiseasefromreal-timeelectrocardiographicmonitoring:Anadaptivemachinelearningapproachonacellphone.InProceedingsoftheAn-nualInternationalConferenceoftheIEEEEngineeringinMedicineandBiologySociety,pages6889–6892,2009.(Citedonpage100.)

[124]JanuszKacprzyk,AnnaWilbik,andSZadrozny.Linguisticsummariza-tionoftimeseriesusingafuzzyquantifierdrivenaggregation.FuzzySetsandSystems,159(12):1485–1499,2008.(Citedonpages111and139.)

[125]JanuszKacprzykandSławomirZadrozny.Linguisticdatabasesum-mariesandtheirprotoforms:towardsnaturallanguagebasedknowledgediscoverytools.InformationSciences,173(4):281–304,2005.(Citedonpage28.)

[126]JanuszKacprzykandSławomirZadrozny.Computingwithwordsisanimplementableparadigm:fuzzyqueries,linguisticdatasummaries,andnatural-languagegeneration.IEEETransactionsonFuzzySystems,18(3):461–472,2010.(Citedonpages27and28.)

[127]JayantKalagnanamandMaxHenrion.Acomparisonofdecisionanaly-sisandexpertrulesforsequentialdiagnosis.InMachineIntelligenceandPatternRecognition,volume9,pages271–281.Elsevier,1990.(Citedonpage98.)

[128]WalterKarlen,ClaudioMattiussi,andDarioFloreano.Sleepandwakeclassificationwithecgandrespiratoryeffortsignals.IEEETransactionsonBiomedicalCircuitsandSystems,3(2):71–78,2009.(Citedonpages95and98.)

[129]EamonnKeogh.Instance-basedlearning.InEncyclopediaofMachineLearning,pages549–550.Springer,2011.(Citedonpage25.)

[130]EamonnKeogh,SelinaChu,DavidHart,andMichaelPazzani.Segment-ingtimeseries:Asurveyandnovelapproach.Dataminingintimeseriesdatabases,57:1–22,2004.(Citedonpage110.)

BIBLIOGRAPHY 177

[120] Lidija Iordanskaja, Myunghee Kim, Richard Kittredge, Benoit Lavoie,and Alain Polguere. Generation of extended bilingual statistical reports.In Proceedings of the 14th conference on Computational linguistics-Volume 3, pages 1019–1023. Association for Computational Linguistics,1992. (Cited on page 31.)

[121] Anil K Jain. Data clustering: 50 years beyond k-means. Pattern recogni-tion letters, 31(8):651–666, 2010. (Cited on page 116.)

[122] Tony Jebara. Machine learning: discriminative and generative, volume755. Springer Science & Business Media, 2012. (Cited on page 83.)

[123] Zhanpeng Jin, Yuwen Sun, and A. C. Cheng. Predicting cardiovascu-lar disease from real-time electrocardiographic monitoring: An adaptivemachine learning approach on a cell phone. In Proceedings of the An-nual International Conference of the IEEE Engineering in Medicine andBiology Society, pages 6889–6892, 2009. (Cited on page 100.)

[124] Janusz Kacprzyk, Anna Wilbik, and S Zadrozny. Linguistic summariza-tion of time series using a fuzzy quantifier driven aggregation. Fuzzy Setsand Systems, 159(12):1485–1499, 2008. (Cited on pages 111 and 139.)

[125] Janusz Kacprzyk and Sławomir Zadrozny. Linguistic database sum-maries and their protoforms: towards natural language based knowledgediscovery tools. Information Sciences, 173(4):281–304, 2005. (Citedon page 28.)

[126] Janusz Kacprzyk and Sławomir Zadrozny. Computing with words isan implementable paradigm: fuzzy queries, linguistic data summaries,and natural-language generation. IEEE Transactions on Fuzzy Systems,18(3):461–472, 2010. (Cited on pages 27 and 28.)

[127] Jayant Kalagnanam and Max Henrion. A comparison of decision analy-sis and expert rules for sequential diagnosis. In Machine Intelligence andPattern Recognition, volume 9, pages 271–281. Elsevier, 1990. (Citedon page 98.)

[128] Walter Karlen, Claudio Mattiussi, and Dario Floreano. Sleep and wakeclassification with ecg and respiratory effort signals. IEEE Transactionson Biomedical Circuits and Systems, 3(2):71–78, 2009. (Cited on pages95 and 98.)

[129] Eamonn Keogh. Instance-based learning. In Encyclopedia of MachineLearning, pages 549–550. Springer, 2011. (Cited on page 25.)

[130] Eamonn Keogh, Selina Chu, David Hart, and Michael Pazzani. Segment-ing time series: A survey and novel approach. Data mining in time seriesdatabases, 57:1–22, 2004. (Cited on page 110.)

BIBLIOGRAPHY177

[120]LidijaIordanskaja,MyungheeKim,RichardKittredge,BenoitLavoie,andAlainPolguere.Generationofextendedbilingualstatisticalreports.InProceedingsofthe14thconferenceonComputationallinguistics-Volume3,pages1019–1023.AssociationforComputationalLinguistics,1992.(Citedonpage31.)

[121]AnilKJain.Dataclustering:50yearsbeyondk-means.Patternrecogni-tionletters,31(8):651–666,2010.(Citedonpage116.)

[122]TonyJebara.Machinelearning:discriminativeandgenerative,volume755.SpringerScience&BusinessMedia,2012.(Citedonpage83.)

[123]ZhanpengJin,YuwenSun,andA.C.Cheng.Predictingcardiovascu-lardiseasefromreal-timeelectrocardiographicmonitoring:Anadaptivemachinelearningapproachonacellphone.InProceedingsoftheAn-nualInternationalConferenceoftheIEEEEngineeringinMedicineandBiologySociety,pages6889–6892,2009.(Citedonpage100.)

[124]JanuszKacprzyk,AnnaWilbik,andSZadrozny.Linguisticsummariza-tionoftimeseriesusingafuzzyquantifierdrivenaggregation.FuzzySetsandSystems,159(12):1485–1499,2008.(Citedonpages111and139.)

[125]JanuszKacprzykandSławomirZadrozny.Linguisticdatabasesum-mariesandtheirprotoforms:towardsnaturallanguagebasedknowledgediscoverytools.InformationSciences,173(4):281–304,2005.(Citedonpage28.)

[126]JanuszKacprzykandSławomirZadrozny.Computingwithwordsisanimplementableparadigm:fuzzyqueries,linguisticdatasummaries,andnatural-languagegeneration.IEEETransactionsonFuzzySystems,18(3):461–472,2010.(Citedonpages27and28.)

[127]JayantKalagnanamandMaxHenrion.Acomparisonofdecisionanaly-sisandexpertrulesforsequentialdiagnosis.InMachineIntelligenceandPatternRecognition,volume9,pages271–281.Elsevier,1990.(Citedonpage98.)

[128]WalterKarlen,ClaudioMattiussi,andDarioFloreano.Sleepandwakeclassificationwithecgandrespiratoryeffortsignals.IEEETransactionsonBiomedicalCircuitsandSystems,3(2):71–78,2009.(Citedonpages95and98.)

[129]EamonnKeogh.Instance-basedlearning.InEncyclopediaofMachineLearning,pages549–550.Springer,2011.(Citedonpage25.)

[130]EamonnKeogh,SelinaChu,DavidHart,andMichaelPazzani.Segment-ingtimeseries:Asurveyandnovelapproach.Dataminingintimeseriesdatabases,57:1–22,2004.(Citedonpage110.)

178 BIBLIOGRAPHY

[131] Carsten Keßler. Conceptual spaces for data descriptions. In The cog-nitive approach to modeling environments (CAME), workshop at GI-Science, pages 29–35, 2006. (Cited on pages 4, 26, and 50.)

[132] Jonghun Kim, Jaekwon Kim, Daesung Lee, and Kyung-Yong Chung. On-tology driven interactive healthcare with wearable sensors. MultimediaTools and Applications, 71(2):827–841, 2014. (Cited on page 103.)

[133] Rik Koncel-Kedziorski, Hannaneh Hajishirzi, and Ali Farhadi. Multi-resolution language grounding with weak supervision. In Proceedings ofthe 2014 Conference on Empirical Methods in Natural Language Pro-cessing (EMNLP), pages 386–396, 2014. (Cited on page 32.)

[134] Sotiris Kotsiantis and Dimitris Kanellopoulos. Association rules min-ing: A recent overview. GESTS International Transactions on Com-puter Science and Engineering, 32(1):71–82, 2006. (Cited on pages113 and 121.)

[135] Christoph H Lampert, Hannes Nickisch, and Stefan Harmeling. Learn-ing to detect unseen object classes by between-class attribute transfer. InComputer Vision and Pattern Recognition, pages 951–958, June 2009.(Cited on page 160.)

[136] Martin Längkvist, Lars Karlsson, and Amy Loutfi. Sleep stage classifica-tion using unsupervised feature learning. Advances in Artificial NeuralSystems, 2012. (Cited on page 102.)

[137] Oscar D Lara and Miguel A Labrador. A survey on human activityrecognition using wearable sensors. IEEE Communications Surveys andTutorials, 15(3):1192–1209, 2013. (Cited on page 96.)

[138] Kevin LeBlanc. Cooperative anchoring: sharing information about ob-jects in multi-robot systems. 2010. (Cited on page 25.)

[139] Ickjai Lee. Data mining coupled conceptual spaces for intelligent agentsin data-rich environments. In International Conference on Knowledge-Based and Intelligent Information and Engineering Systems, pages 42–48. Springer, 2005. (Cited on page 26.)

[140] Ickjai Lee and Bayani Portier. An empirical study of knowledge repre-sentation and learning within conceptual spaces for intelligent agents. In6th IEEE/ACIS International Conference on Computer and InformationScience, ICIS 2007., pages 463–468. IEEE, 2007. (Cited on page 26.)

[141] Kyong Ho Lee, Sun-Yuan Kung, and Naveen Verma. Low-energy formu-lations of support vector machine kernel functions for biomedical sen-sor applications. Journal of Signal Processing Systems, 69(3):339–349,2012. (Cited on pages 93, 97, 98, 99, and 100.)

178BIBLIOGRAPHY

[131]CarstenKeßler.Conceptualspacesfordatadescriptions.InThecog-nitiveapproachtomodelingenvironments(CAME),workshopatGI-Science,pages29–35,2006.(Citedonpages4,26,and50.)

[132]JonghunKim,JaekwonKim,DaesungLee,andKyung-YongChung.On-tologydriveninteractivehealthcarewithwearablesensors.MultimediaToolsandApplications,71(2):827–841,2014.(Citedonpage103.)

[133]RikKoncel-Kedziorski,HannanehHajishirzi,andAliFarhadi.Multi-resolutionlanguagegroundingwithweaksupervision.InProceedingsofthe2014ConferenceonEmpiricalMethodsinNaturalLanguagePro-cessing(EMNLP),pages386–396,2014.(Citedonpage32.)

[134]SotirisKotsiantisandDimitrisKanellopoulos.Associationrulesmin-ing:Arecentoverview.GESTSInternationalTransactionsonCom-puterScienceandEngineering,32(1):71–82,2006.(Citedonpages113and121.)

[135]ChristophHLampert,HannesNickisch,andStefanHarmeling.Learn-ingtodetectunseenobjectclassesbybetween-classattributetransfer.InComputerVisionandPatternRecognition,pages951–958,June2009.(Citedonpage160.)

[136]MartinLängkvist,LarsKarlsson,andAmyLoutfi.Sleepstageclassifica-tionusingunsupervisedfeaturelearning.AdvancesinArtificialNeuralSystems,2012.(Citedonpage102.)

[137]OscarDLaraandMiguelALabrador.Asurveyonhumanactivityrecognitionusingwearablesensors.IEEECommunicationsSurveysandTutorials,15(3):1192–1209,2013.(Citedonpage96.)

[138]KevinLeBlanc.Cooperativeanchoring:sharinginformationaboutob-jectsinmulti-robotsystems.2010.(Citedonpage25.)

[139]IckjaiLee.Dataminingcoupledconceptualspacesforintelligentagentsindata-richenvironments.InInternationalConferenceonKnowledge-BasedandIntelligentInformationandEngineeringSystems,pages42–48.Springer,2005.(Citedonpage26.)

[140]IckjaiLeeandBayaniPortier.Anempiricalstudyofknowledgerepre-sentationandlearningwithinconceptualspacesforintelligentagents.In6thIEEE/ACISInternationalConferenceonComputerandInformationScience,ICIS2007.,pages463–468.IEEE,2007.(Citedonpage26.)

[141]KyongHoLee,Sun-YuanKung,andNaveenVerma.Low-energyformu-lationsofsupportvectormachinekernelfunctionsforbiomedicalsen-sorapplications.JournalofSignalProcessingSystems,69(3):339–349,2012.(Citedonpages93,97,98,99,and100.)

178 BIBLIOGRAPHY

[131] Carsten Keßler. Conceptual spaces for data descriptions. In The cog-nitive approach to modeling environments (CAME), workshop at GI-Science, pages 29–35, 2006. (Cited on pages 4, 26, and 50.)

[132] Jonghun Kim, Jaekwon Kim, Daesung Lee, and Kyung-Yong Chung. On-tology driven interactive healthcare with wearable sensors. MultimediaTools and Applications, 71(2):827–841, 2014. (Cited on page 103.)

[133] Rik Koncel-Kedziorski, Hannaneh Hajishirzi, and Ali Farhadi. Multi-resolution language grounding with weak supervision. In Proceedings ofthe 2014 Conference on Empirical Methods in Natural Language Pro-cessing (EMNLP), pages 386–396, 2014. (Cited on page 32.)

[134] Sotiris Kotsiantis and Dimitris Kanellopoulos. Association rules min-ing: A recent overview. GESTS International Transactions on Com-puter Science and Engineering, 32(1):71–82, 2006. (Cited on pages113 and 121.)

[135] Christoph H Lampert, Hannes Nickisch, and Stefan Harmeling. Learn-ing to detect unseen object classes by between-class attribute transfer. InComputer Vision and Pattern Recognition, pages 951–958, June 2009.(Cited on page 160.)

[136] Martin Längkvist, Lars Karlsson, and Amy Loutfi. Sleep stage classifica-tion using unsupervised feature learning. Advances in Artificial NeuralSystems, 2012. (Cited on page 102.)

[137] Oscar D Lara and Miguel A Labrador. A survey on human activityrecognition using wearable sensors. IEEE Communications Surveys andTutorials, 15(3):1192–1209, 2013. (Cited on page 96.)

[138] Kevin LeBlanc. Cooperative anchoring: sharing information about ob-jects in multi-robot systems. 2010. (Cited on page 25.)

[139] Ickjai Lee. Data mining coupled conceptual spaces for intelligent agentsin data-rich environments. In International Conference on Knowledge-Based and Intelligent Information and Engineering Systems, pages 42–48. Springer, 2005. (Cited on page 26.)

[140] Ickjai Lee and Bayani Portier. An empirical study of knowledge repre-sentation and learning within conceptual spaces for intelligent agents. In6th IEEE/ACIS International Conference on Computer and InformationScience, ICIS 2007., pages 463–468. IEEE, 2007. (Cited on page 26.)

[141] Kyong Ho Lee, Sun-Yuan Kung, and Naveen Verma. Low-energy formu-lations of support vector machine kernel functions for biomedical sen-sor applications. Journal of Signal Processing Systems, 69(3):339–349,2012. (Cited on pages 93, 97, 98, 99, and 100.)

178BIBLIOGRAPHY

[131]CarstenKeßler.Conceptualspacesfordatadescriptions.InThecog-nitiveapproachtomodelingenvironments(CAME),workshopatGI-Science,pages29–35,2006.(Citedonpages4,26,and50.)

[132]JonghunKim,JaekwonKim,DaesungLee,andKyung-YongChung.On-tologydriveninteractivehealthcarewithwearablesensors.MultimediaToolsandApplications,71(2):827–841,2014.(Citedonpage103.)

[133]RikKoncel-Kedziorski,HannanehHajishirzi,andAliFarhadi.Multi-resolutionlanguagegroundingwithweaksupervision.InProceedingsofthe2014ConferenceonEmpiricalMethodsinNaturalLanguagePro-cessing(EMNLP),pages386–396,2014.(Citedonpage32.)

[134]SotirisKotsiantisandDimitrisKanellopoulos.Associationrulesmin-ing:Arecentoverview.GESTSInternationalTransactionsonCom-puterScienceandEngineering,32(1):71–82,2006.(Citedonpages113and121.)

[135]ChristophHLampert,HannesNickisch,andStefanHarmeling.Learn-ingtodetectunseenobjectclassesbybetween-classattributetransfer.InComputerVisionandPatternRecognition,pages951–958,June2009.(Citedonpage160.)

[136]MartinLängkvist,LarsKarlsson,andAmyLoutfi.Sleepstageclassifica-tionusingunsupervisedfeaturelearning.AdvancesinArtificialNeuralSystems,2012.(Citedonpage102.)

[137]OscarDLaraandMiguelALabrador.Asurveyonhumanactivityrecognitionusingwearablesensors.IEEECommunicationsSurveysandTutorials,15(3):1192–1209,2013.(Citedonpage96.)

[138]KevinLeBlanc.Cooperativeanchoring:sharinginformationaboutob-jectsinmulti-robotsystems.2010.(Citedonpage25.)

[139]IckjaiLee.Dataminingcoupledconceptualspacesforintelligentagentsindata-richenvironments.InInternationalConferenceonKnowledge-BasedandIntelligentInformationandEngineeringSystems,pages42–48.Springer,2005.(Citedonpage26.)

[140]IckjaiLeeandBayaniPortier.Anempiricalstudyofknowledgerepre-sentationandlearningwithinconceptualspacesforintelligentagents.In6thIEEE/ACISInternationalConferenceonComputerandInformationScience,ICIS2007.,pages463–468.IEEE,2007.(Citedonpage26.)

[141]KyongHoLee,Sun-YuanKung,andNaveenVerma.Low-energyformu-lationsofsupportvectormachinekernelfunctionsforbiomedicalsen-sorapplications.JournalofSignalProcessingSystems,69(3):339–349,2012.(Citedonpages93,97,98,99,and100.)

BIBLIOGRAPHY 179

[142] Martha Lewis and Jonathan Lawry. Hierarchical conceptual spaces forconcept combination. Artificial Intelligence, 237:204–227, 2016. (Citedon page 66.)

[143] Qiao Li and Gari D Clifford. Dynamic time warping and machine learn-ing for signal quality assessment of pulsatile signals. Physiological mea-surement, 33(9):1491–1501, 2012. (Cited on page 98.)

[144] Xiaokun Li and Fatih Porikli. Human state classification and predi-cation for critical care monitoring by real-time bio-signal analysis. InProceedings of the 20th International Conference on Pattern Recogni-tion, pages 2460–2463, Washington, DC, USA, 2010. IEEE ComputerSociety. (Cited on page 101.)

[145] T Warren Liao. Clustering of time series data–a survey. Pattern recogni-tion, 38(11):1857–1874, 2005. (Cited on page 114.)

[146] Moshe Lichman. UCI machine learning repository. Available online:http://archive.ics.uci.edu/ml, 2013. (Cited on page 70.)

[147] Antonio Lieto, Antonio Chella, and Marcello Frixione. Conceptualspaces for cognitive architectures: A lingua franca for different levelsof representation. Biologically Inspired Cognitive Architectures, 19:1–9,2017. (Cited on page 25.)

[148] Jessica Lin, Eamonn Keogh, Stefano Lonardi, and Bill Chiu. A symbolicrepresentation of time series, with implications for streaming algorithms.In Proceedings of the 8th ACM SIGMOD workshop on Research is-sues in data mining and knowledge discovery, pages 2–11. ACM, 2003.(Cited on page 110.)

[149] Catherine Loader. Smoothing: local regression techniques. In Handbookof Computational Statistics, pages 571–596. Springer, 2012. (Cited onpage 110.)

[150] Joan Albert López-Vallverdú, David RiañO, and John A Bohada. Im-proving medical decision trees by combining relevant health-care criteria.Expert Systems with Applications, 39(14):11782–11791, 2012. (Citedon page 98.)

[151] George F Luger. Artificial intelligence: structures and strategies for com-plex problem solving. Pearson education, 2005. (Cited on page 25.)

[152] Yi Mao, Wenlin Chen, Yixin Chen, Chenyang Lu, Marin Kollef, andThomas Bailey. An integrated data mining approach to real-time clinicalmonitoring and deterioration warning. In Proceedings of the 18th ACMSIGKDD international conference on Knowledge discovery and data

BIBLIOGRAPHY179

[142]MarthaLewisandJonathanLawry.Hierarchicalconceptualspacesforconceptcombination.ArtificialIntelligence,237:204–227,2016.(Citedonpage66.)

[143]QiaoLiandGariDClifford.Dynamictimewarpingandmachinelearn-ingforsignalqualityassessmentofpulsatilesignals.Physiologicalmea-surement,33(9):1491–1501,2012.(Citedonpage98.)

[144]XiaokunLiandFatihPorikli.Humanstateclassificationandpredi-cationforcriticalcaremonitoringbyreal-timebio-signalanalysis.InProceedingsofthe20thInternationalConferenceonPatternRecogni-tion,pages2460–2463,Washington,DC,USA,2010.IEEEComputerSociety.(Citedonpage101.)

[145]TWarrenLiao.Clusteringoftimeseriesdata–asurvey.Patternrecogni-tion,38(11):1857–1874,2005.(Citedonpage114.)

[146]MosheLichman.UCImachinelearningrepository.Availableonline:http://archive.ics.uci.edu/ml,2013.(Citedonpage70.)

[147]AntonioLieto,AntonioChella,andMarcelloFrixione.Conceptualspacesforcognitivearchitectures:Alinguafrancafordifferentlevelsofrepresentation.BiologicallyInspiredCognitiveArchitectures,19:1–9,2017.(Citedonpage25.)

[148]JessicaLin,EamonnKeogh,StefanoLonardi,andBillChiu.Asymbolicrepresentationoftimeseries,withimplicationsforstreamingalgorithms.InProceedingsofthe8thACMSIGMODworkshoponResearchis-suesindataminingandknowledgediscovery,pages2–11.ACM,2003.(Citedonpage110.)

[149]CatherineLoader.Smoothing:localregressiontechniques.InHandbookofComputationalStatistics,pages571–596.Springer,2012.(Citedonpage110.)

[150]JoanAlbertLópez-Vallverdú,DavidRiañO,andJohnABohada.Im-provingmedicaldecisiontreesbycombiningrelevanthealth-carecriteria.ExpertSystemswithApplications,39(14):11782–11791,2012.(Citedonpage98.)

[151]GeorgeFLuger.Artificialintelligence:structuresandstrategiesforcom-plexproblemsolving.Pearsoneducation,2005.(Citedonpage25.)

[152]YiMao,WenlinChen,YixinChen,ChenyangLu,MarinKollef,andThomasBailey.Anintegrateddataminingapproachtoreal-timeclinicalmonitoringanddeteriorationwarning.InProceedingsofthe18thACMSIGKDDinternationalconferenceonKnowledgediscoveryanddata

BIBLIOGRAPHY 179

[142] Martha Lewis and Jonathan Lawry. Hierarchical conceptual spaces forconcept combination. Artificial Intelligence, 237:204–227, 2016. (Citedon page 66.)

[143] Qiao Li and Gari D Clifford. Dynamic time warping and machine learn-ing for signal quality assessment of pulsatile signals. Physiological mea-surement, 33(9):1491–1501, 2012. (Cited on page 98.)

[144] Xiaokun Li and Fatih Porikli. Human state classification and predi-cation for critical care monitoring by real-time bio-signal analysis. InProceedings of the 20th International Conference on Pattern Recogni-tion, pages 2460–2463, Washington, DC, USA, 2010. IEEE ComputerSociety. (Cited on page 101.)

[145] T Warren Liao. Clustering of time series data–a survey. Pattern recogni-tion, 38(11):1857–1874, 2005. (Cited on page 114.)

[146] Moshe Lichman. UCI machine learning repository. Available online:http://archive.ics.uci.edu/ml, 2013. (Cited on page 70.)

[147] Antonio Lieto, Antonio Chella, and Marcello Frixione. Conceptualspaces for cognitive architectures: A lingua franca for different levelsof representation. Biologically Inspired Cognitive Architectures, 19:1–9,2017. (Cited on page 25.)

[148] Jessica Lin, Eamonn Keogh, Stefano Lonardi, and Bill Chiu. A symbolicrepresentation of time series, with implications for streaming algorithms.In Proceedings of the 8th ACM SIGMOD workshop on Research is-sues in data mining and knowledge discovery, pages 2–11. ACM, 2003.(Cited on page 110.)

[149] Catherine Loader. Smoothing: local regression techniques. In Handbookof Computational Statistics, pages 571–596. Springer, 2012. (Cited onpage 110.)

[150] Joan Albert López-Vallverdú, David RiañO, and John A Bohada. Im-proving medical decision trees by combining relevant health-care criteria.Expert Systems with Applications, 39(14):11782–11791, 2012. (Citedon page 98.)

[151] George F Luger. Artificial intelligence: structures and strategies for com-plex problem solving. Pearson education, 2005. (Cited on page 25.)

[152] Yi Mao, Wenlin Chen, Yixin Chen, Chenyang Lu, Marin Kollef, andThomas Bailey. An integrated data mining approach to real-time clinicalmonitoring and deterioration warning. In Proceedings of the 18th ACMSIGKDD international conference on Knowledge discovery and data

BIBLIOGRAPHY179

[142]MarthaLewisandJonathanLawry.Hierarchicalconceptualspacesforconceptcombination.ArtificialIntelligence,237:204–227,2016.(Citedonpage66.)

[143]QiaoLiandGariDClifford.Dynamictimewarpingandmachinelearn-ingforsignalqualityassessmentofpulsatilesignals.Physiologicalmea-surement,33(9):1491–1501,2012.(Citedonpage98.)

[144]XiaokunLiandFatihPorikli.Humanstateclassificationandpredi-cationforcriticalcaremonitoringbyreal-timebio-signalanalysis.InProceedingsofthe20thInternationalConferenceonPatternRecogni-tion,pages2460–2463,Washington,DC,USA,2010.IEEEComputerSociety.(Citedonpage101.)

[145]TWarrenLiao.Clusteringoftimeseriesdata–asurvey.Patternrecogni-tion,38(11):1857–1874,2005.(Citedonpage114.)

[146]MosheLichman.UCImachinelearningrepository.Availableonline:http://archive.ics.uci.edu/ml,2013.(Citedonpage70.)

[147]AntonioLieto,AntonioChella,andMarcelloFrixione.Conceptualspacesforcognitivearchitectures:Alinguafrancafordifferentlevelsofrepresentation.BiologicallyInspiredCognitiveArchitectures,19:1–9,2017.(Citedonpage25.)

[148]JessicaLin,EamonnKeogh,StefanoLonardi,andBillChiu.Asymbolicrepresentationoftimeseries,withimplicationsforstreamingalgorithms.InProceedingsofthe8thACMSIGMODworkshoponResearchis-suesindataminingandknowledgediscovery,pages2–11.ACM,2003.(Citedonpage110.)

[149]CatherineLoader.Smoothing:localregressiontechniques.InHandbookofComputationalStatistics,pages571–596.Springer,2012.(Citedonpage110.)

[150]JoanAlbertLópez-Vallverdú,DavidRiañO,andJohnABohada.Im-provingmedicaldecisiontreesbycombiningrelevanthealth-carecriteria.ExpertSystemswithApplications,39(14):11782–11791,2012.(Citedonpage98.)

[151]GeorgeFLuger.Artificialintelligence:structuresandstrategiesforcom-plexproblemsolving.Pearsoneducation,2005.(Citedonpage25.)

[152]YiMao,WenlinChen,YixinChen,ChenyangLu,MarinKollef,andThomasBailey.Anintegrateddataminingapproachtoreal-timeclinicalmonitoringanddeteriorationwarning.InProceedingsofthe18thACMSIGKDDinternationalconferenceonKnowledgediscoveryanddata

180 BIBLIOGRAPHY

mining, pages 1140–1148, New York, NY, USA, 2012. ACM. (Citedon pages 96, 97, 99, and 100.)

[153] Benjamin M. Marlin, David C. Kale, Robinder G. Khemani, and Ran-dall C. Wetzel. Unsupervised pattern discovery in electronic health caredata using probabilistic clustering models. In Proceedings of the 2ndACM SIGHIT International Health Informatics Symposium, pages 389–398, New York, NY, USA, 2012. ACM. (Cited on pages 94, 99, 100,and 101.)

[154] Benjamin M Marlin, David C Kale, Robinder G Khemani, and Randall CWetzel. Unsupervised pattern discovery in electronic health care datausing probabilistic clustering models. In Proceedings of the 2nd ACMSIGHIT International Health Informatics Symposium, pages 389–398.ACM, 2012. (Cited on page 108.)

[155] Marvin L Minsky. Logical versus analogical or symbolic versus connec-tionist or neat versus scruffy. AI magazine, 12(2):34, 1991. (Cited onpage 3.)

[156] George B Moody and Roger G Mark. A database to support develop-ment and evaluation of intelligent intensive care monitoring. In Com-puters in Cardiology, 1996, pages 657–660. IEEE, 1996. (Cited on page107.)

[157] Lailil Muflikhah, Yeni Wahyuningsih, et al. Fuzzy rule generation fordiagnosis of coronary heart disease risk using substractive clusteringmethod. 2013. (Cited on page 122.)

[158] Arijit Mukherjee, Arpan Pal, and Prateep Misra. Data analytics in ubiq-uitous sensor-based health information systems. In Proceedings of the2012 Sixth International Conference on Next Generation Mobile Ap-plications, Services and Technologies, pages 193–198, Washington, DC,USA, 2012. IEEE Computer Society. (Cited on page 93.)

[159] Vishal Nangalia, David R Prytherch, and Gary B Smith. Health technol-ogy assessment review: Remote monitoring of vital signs-current statusand future challenges. Critical Care, 14(5):233, 2010. (Cited on page92.)

[160] Alex Nanopoulos, Rob Alcock, and Yannis Manolopoulos. Feature-based classification of time-series data. International Journal of Com-puter Research, 10(3):49–61, 2001. (Cited on page 141.)

[161] Kali Vara Prasad Naraharisetti and Manan Bawa. Comparison of differ-ent signal processing methods for reducing artifacts from photoplethys-mograph signal. In Proceedings of the IEEE International Conference on

180BIBLIOGRAPHY

mining,pages1140–1148,NewYork,NY,USA,2012.ACM.(Citedonpages96,97,99,and100.)

[153]BenjaminM.Marlin,DavidC.Kale,RobinderG.Khemani,andRan-dallC.Wetzel.Unsupervisedpatterndiscoveryinelectronichealthcaredatausingprobabilisticclusteringmodels.InProceedingsofthe2ndACMSIGHITInternationalHealthInformaticsSymposium,pages389–398,NewYork,NY,USA,2012.ACM.(Citedonpages94,99,100,and101.)

[154]BenjaminMMarlin,DavidCKale,RobinderGKhemani,andRandallCWetzel.Unsupervisedpatterndiscoveryinelectronichealthcaredatausingprobabilisticclusteringmodels.InProceedingsofthe2ndACMSIGHITInternationalHealthInformaticsSymposium,pages389–398.ACM,2012.(Citedonpage108.)

[155]MarvinLMinsky.Logicalversusanalogicalorsymbolicversusconnec-tionistorneatversusscruffy.AImagazine,12(2):34,1991.(Citedonpage3.)

[156]GeorgeBMoodyandRogerGMark.Adatabasetosupportdevelop-mentandevaluationofintelligentintensivecaremonitoring.InCom-putersinCardiology,1996,pages657–660.IEEE,1996.(Citedonpage107.)

[157]LaililMuflikhah,YeniWahyuningsih,etal.Fuzzyrulegenerationfordiagnosisofcoronaryheartdiseaseriskusingsubstractiveclusteringmethod.2013.(Citedonpage122.)

[158]ArijitMukherjee,ArpanPal,andPrateepMisra.Dataanalyticsinubiq-uitoussensor-basedhealthinformationsystems.InProceedingsofthe2012SixthInternationalConferenceonNextGenerationMobileAp-plications,ServicesandTechnologies,pages193–198,Washington,DC,USA,2012.IEEEComputerSociety.(Citedonpage93.)

[159]VishalNangalia,DavidRPrytherch,andGaryBSmith.Healthtechnol-ogyassessmentreview:Remotemonitoringofvitalsigns-currentstatusandfuturechallenges.CriticalCare,14(5):233,2010.(Citedonpage92.)

[160]AlexNanopoulos,RobAlcock,andYannisManolopoulos.Feature-basedclassificationoftime-seriesdata.InternationalJournalofCom-puterResearch,10(3):49–61,2001.(Citedonpage141.)

[161]KaliVaraPrasadNaraharisettiandMananBawa.Comparisonofdiffer-entsignalprocessingmethodsforreducingartifactsfromphotoplethys-mographsignal.InProceedingsoftheIEEEInternationalConferenceon

180 BIBLIOGRAPHY

mining, pages 1140–1148, New York, NY, USA, 2012. ACM. (Citedon pages 96, 97, 99, and 100.)

[153] Benjamin M. Marlin, David C. Kale, Robinder G. Khemani, and Ran-dall C. Wetzel. Unsupervised pattern discovery in electronic health caredata using probabilistic clustering models. In Proceedings of the 2ndACM SIGHIT International Health Informatics Symposium, pages 389–398, New York, NY, USA, 2012. ACM. (Cited on pages 94, 99, 100,and 101.)

[154] Benjamin M Marlin, David C Kale, Robinder G Khemani, and Randall CWetzel. Unsupervised pattern discovery in electronic health care datausing probabilistic clustering models. In Proceedings of the 2nd ACMSIGHIT International Health Informatics Symposium, pages 389–398.ACM, 2012. (Cited on page 108.)

[155] Marvin L Minsky. Logical versus analogical or symbolic versus connec-tionist or neat versus scruffy. AI magazine, 12(2):34, 1991. (Cited onpage 3.)

[156] George B Moody and Roger G Mark. A database to support develop-ment and evaluation of intelligent intensive care monitoring. In Com-puters in Cardiology, 1996, pages 657–660. IEEE, 1996. (Cited on page107.)

[157] Lailil Muflikhah, Yeni Wahyuningsih, et al. Fuzzy rule generation fordiagnosis of coronary heart disease risk using substractive clusteringmethod. 2013. (Cited on page 122.)

[158] Arijit Mukherjee, Arpan Pal, and Prateep Misra. Data analytics in ubiq-uitous sensor-based health information systems. In Proceedings of the2012 Sixth International Conference on Next Generation Mobile Ap-plications, Services and Technologies, pages 193–198, Washington, DC,USA, 2012. IEEE Computer Society. (Cited on page 93.)

[159] Vishal Nangalia, David R Prytherch, and Gary B Smith. Health technol-ogy assessment review: Remote monitoring of vital signs-current statusand future challenges. Critical Care, 14(5):233, 2010. (Cited on page92.)

[160] Alex Nanopoulos, Rob Alcock, and Yannis Manolopoulos. Feature-based classification of time-series data. International Journal of Com-puter Research, 10(3):49–61, 2001. (Cited on page 141.)

[161] Kali Vara Prasad Naraharisetti and Manan Bawa. Comparison of differ-ent signal processing methods for reducing artifacts from photoplethys-mograph signal. In Proceedings of the IEEE International Conference on

180BIBLIOGRAPHY

mining,pages1140–1148,NewYork,NY,USA,2012.ACM.(Citedonpages96,97,99,and100.)

[153]BenjaminM.Marlin,DavidC.Kale,RobinderG.Khemani,andRan-dallC.Wetzel.Unsupervisedpatterndiscoveryinelectronichealthcaredatausingprobabilisticclusteringmodels.InProceedingsofthe2ndACMSIGHITInternationalHealthInformaticsSymposium,pages389–398,NewYork,NY,USA,2012.ACM.(Citedonpages94,99,100,and101.)

[154]BenjaminMMarlin,DavidCKale,RobinderGKhemani,andRandallCWetzel.Unsupervisedpatterndiscoveryinelectronichealthcaredatausingprobabilisticclusteringmodels.InProceedingsofthe2ndACMSIGHITInternationalHealthInformaticsSymposium,pages389–398.ACM,2012.(Citedonpage108.)

[155]MarvinLMinsky.Logicalversusanalogicalorsymbolicversusconnec-tionistorneatversusscruffy.AImagazine,12(2):34,1991.(Citedonpage3.)

[156]GeorgeBMoodyandRogerGMark.Adatabasetosupportdevelop-mentandevaluationofintelligentintensivecaremonitoring.InCom-putersinCardiology,1996,pages657–660.IEEE,1996.(Citedonpage107.)

[157]LaililMuflikhah,YeniWahyuningsih,etal.Fuzzyrulegenerationfordiagnosisofcoronaryheartdiseaseriskusingsubstractiveclusteringmethod.2013.(Citedonpage122.)

[158]ArijitMukherjee,ArpanPal,andPrateepMisra.Dataanalyticsinubiq-uitoussensor-basedhealthinformationsystems.InProceedingsofthe2012SixthInternationalConferenceonNextGenerationMobileAp-plications,ServicesandTechnologies,pages193–198,Washington,DC,USA,2012.IEEEComputerSociety.(Citedonpage93.)

[159]VishalNangalia,DavidRPrytherch,andGaryBSmith.Healthtechnol-ogyassessmentreview:Remotemonitoringofvitalsigns-currentstatusandfuturechallenges.CriticalCare,14(5):233,2010.(Citedonpage92.)

[160]AlexNanopoulos,RobAlcock,andYannisManolopoulos.Feature-basedclassificationoftime-seriesdata.InternationalJournalofCom-puterResearch,10(3):49–61,2001.(Citedonpage141.)

[161]KaliVaraPrasadNaraharisettiandMananBawa.Comparisonofdiffer-entsignalprocessingmethodsforreducingartifactsfromphotoplethys-mographsignal.InProceedingsoftheIEEEInternationalConferenceon

BIBLIOGRAPHY 181

Electro/Information Technology (EIT), pages 1–8. IEEE, 2011. (Citedon page 99.)

[162] Andrew Y Ng and Michael I Jordan. On discriminative vs. generativeclassifiers: A comparison of logistic regression and naive bayes. In Ad-vances in neural information processing systems, pages 841–848, 2002.(Cited on page 83.)

[163] Adam Niewiadomski and Izabela Superson. Multi-subject type-2 lin-guistic summaries of relational databases. In Frontiers of Higher OrderFuzzy Sets, pages 167–181. Springer, 2015. (Cited on page 28.)

[164] Nils J Nilsson. Artificial intelligence: a new synthesis. Elsevier, 1998.(Cited on page 20.)

[165] Vilém Novák. Linguistic characterization of time series. Fuzzy Sets andSystems, 285:52–72, 2016. (Cited on pages 63, 139, and 140.)

[166] Miho Ohsaki, Yoshinori Sato, Hideto Yokoi, and Takahira Yamaguchi.A rule discovery support system for sequential medical data, in the casestudy of a chronic hepatitis dataset. In Workshop Notes of the Interna-tional Workshop on Active Mining, at IEEE International Conferenceon Data Mining, 2002. (Cited on page 121.)

[167] Patricia Ordonez, Tom Armstrong, Tim Oates, and Jim Fackler. Classifi-cation of patients using novel multivariate time series representations ofphysiological data. In 10th International Conference on Machine Learn-ing and Applications and Workshops (ICMLA), volume 2, pages 172–179. IEEE, 2011. (Cited on page 98.)

[168] Mukta Paliwal and Usha A Kumar. Neural networks and statisticaltechniques: A review of applications. Expert systems with applications,36(1):2–17, 2009. (Cited on page 97.)

[169] Alexandros Pantelopoulos and Nikolaos G Bourbakis. Prognosis -a wearable health-monitoring system for people at risk: Methodol-ogy and modeling. IEEE Transactions on Information Technology inBiomedicine, 14(3):613–621, 2010. (Cited on page 95.)

[170] Abhilasha M. Patel, Pankaj K. Gakare, and A. N. Cheeran. Real time ecgfeature extraction and arrhythmia detection on a mobile platform. Inter-national Journal of Computer Applications, 44(23):40–45, April 2012.(Cited on pages 99, 100, and 101.)

[171] Hanchuan Peng, Fuhui Long, and Chris Ding. Feature selection basedon mutual information criteria of max-dependency, max-relevance, andmin-redundancy. IEEE Transactions on pattern analysis and machineintelligence, 27(8):1226–1238, 2005. (Cited on page 40.)

BIBLIOGRAPHY181

Electro/InformationTechnology(EIT),pages1–8.IEEE,2011.(Citedonpage99.)

[162]AndrewYNgandMichaelIJordan.Ondiscriminativevs.generativeclassifiers:Acomparisonoflogisticregressionandnaivebayes.InAd-vancesinneuralinformationprocessingsystems,pages841–848,2002.(Citedonpage83.)

[163]AdamNiewiadomskiandIzabelaSuperson.Multi-subjecttype-2lin-guisticsummariesofrelationaldatabases.InFrontiersofHigherOrderFuzzySets,pages167–181.Springer,2015.(Citedonpage28.)

[164]NilsJNilsson.Artificialintelligence:anewsynthesis.Elsevier,1998.(Citedonpage20.)

[165]VilémNovák.Linguisticcharacterizationoftimeseries.FuzzySetsandSystems,285:52–72,2016.(Citedonpages63,139,and140.)

[166]MihoOhsaki,YoshinoriSato,HidetoYokoi,andTakahiraYamaguchi.Arulediscoverysupportsystemforsequentialmedicaldata,inthecasestudyofachronichepatitisdataset.InWorkshopNotesoftheInterna-tionalWorkshoponActiveMining,atIEEEInternationalConferenceonDataMining,2002.(Citedonpage121.)

[167]PatriciaOrdonez,TomArmstrong,TimOates,andJimFackler.Classifi-cationofpatientsusingnovelmultivariatetimeseriesrepresentationsofphysiologicaldata.In10thInternationalConferenceonMachineLearn-ingandApplicationsandWorkshops(ICMLA),volume2,pages172–179.IEEE,2011.(Citedonpage98.)

[168]MuktaPaliwalandUshaAKumar.Neuralnetworksandstatisticaltechniques:Areviewofapplications.Expertsystemswithapplications,36(1):2–17,2009.(Citedonpage97.)

[169]AlexandrosPantelopoulosandNikolaosGBourbakis.Prognosis-awearablehealth-monitoringsystemforpeopleatrisk:Methodol-ogyandmodeling.IEEETransactionsonInformationTechnologyinBiomedicine,14(3):613–621,2010.(Citedonpage95.)

[170]AbhilashaM.Patel,PankajK.Gakare,andA.N.Cheeran.Realtimeecgfeatureextractionandarrhythmiadetectiononamobileplatform.Inter-nationalJournalofComputerApplications,44(23):40–45,April2012.(Citedonpages99,100,and101.)

[171]HanchuanPeng,FuhuiLong,andChrisDing.Featureselectionbasedonmutualinformationcriteriaofmax-dependency,max-relevance,andmin-redundancy.IEEETransactionsonpatternanalysisandmachineintelligence,27(8):1226–1238,2005.(Citedonpage40.)

BIBLIOGRAPHY 181

Electro/Information Technology (EIT), pages 1–8. IEEE, 2011. (Citedon page 99.)

[162] Andrew Y Ng and Michael I Jordan. On discriminative vs. generativeclassifiers: A comparison of logistic regression and naive bayes. In Ad-vances in neural information processing systems, pages 841–848, 2002.(Cited on page 83.)

[163] Adam Niewiadomski and Izabela Superson. Multi-subject type-2 lin-guistic summaries of relational databases. In Frontiers of Higher OrderFuzzy Sets, pages 167–181. Springer, 2015. (Cited on page 28.)

[164] Nils J Nilsson. Artificial intelligence: a new synthesis. Elsevier, 1998.(Cited on page 20.)

[165] Vilém Novák. Linguistic characterization of time series. Fuzzy Sets andSystems, 285:52–72, 2016. (Cited on pages 63, 139, and 140.)

[166] Miho Ohsaki, Yoshinori Sato, Hideto Yokoi, and Takahira Yamaguchi.A rule discovery support system for sequential medical data, in the casestudy of a chronic hepatitis dataset. In Workshop Notes of the Interna-tional Workshop on Active Mining, at IEEE International Conferenceon Data Mining, 2002. (Cited on page 121.)

[167] Patricia Ordonez, Tom Armstrong, Tim Oates, and Jim Fackler. Classifi-cation of patients using novel multivariate time series representations ofphysiological data. In 10th International Conference on Machine Learn-ing and Applications and Workshops (ICMLA), volume 2, pages 172–179. IEEE, 2011. (Cited on page 98.)

[168] Mukta Paliwal and Usha A Kumar. Neural networks and statisticaltechniques: A review of applications. Expert systems with applications,36(1):2–17, 2009. (Cited on page 97.)

[169] Alexandros Pantelopoulos and Nikolaos G Bourbakis. Prognosis -a wearable health-monitoring system for people at risk: Methodol-ogy and modeling. IEEE Transactions on Information Technology inBiomedicine, 14(3):613–621, 2010. (Cited on page 95.)

[170] Abhilasha M. Patel, Pankaj K. Gakare, and A. N. Cheeran. Real time ecgfeature extraction and arrhythmia detection on a mobile platform. Inter-national Journal of Computer Applications, 44(23):40–45, April 2012.(Cited on pages 99, 100, and 101.)

[171] Hanchuan Peng, Fuhui Long, and Chris Ding. Feature selection basedon mutual information criteria of max-dependency, max-relevance, andmin-redundancy. IEEE Transactions on pattern analysis and machineintelligence, 27(8):1226–1238, 2005. (Cited on page 40.)

BIBLIOGRAPHY181

Electro/InformationTechnology(EIT),pages1–8.IEEE,2011.(Citedonpage99.)

[162]AndrewYNgandMichaelIJordan.Ondiscriminativevs.generativeclassifiers:Acomparisonoflogisticregressionandnaivebayes.InAd-vancesinneuralinformationprocessingsystems,pages841–848,2002.(Citedonpage83.)

[163]AdamNiewiadomskiandIzabelaSuperson.Multi-subjecttype-2lin-guisticsummariesofrelationaldatabases.InFrontiersofHigherOrderFuzzySets,pages167–181.Springer,2015.(Citedonpage28.)

[164]NilsJNilsson.Artificialintelligence:anewsynthesis.Elsevier,1998.(Citedonpage20.)

[165]VilémNovák.Linguisticcharacterizationoftimeseries.FuzzySetsandSystems,285:52–72,2016.(Citedonpages63,139,and140.)

[166]MihoOhsaki,YoshinoriSato,HidetoYokoi,andTakahiraYamaguchi.Arulediscoverysupportsystemforsequentialmedicaldata,inthecasestudyofachronichepatitisdataset.InWorkshopNotesoftheInterna-tionalWorkshoponActiveMining,atIEEEInternationalConferenceonDataMining,2002.(Citedonpage121.)

[167]PatriciaOrdonez,TomArmstrong,TimOates,andJimFackler.Classifi-cationofpatientsusingnovelmultivariatetimeseriesrepresentationsofphysiologicaldata.In10thInternationalConferenceonMachineLearn-ingandApplicationsandWorkshops(ICMLA),volume2,pages172–179.IEEE,2011.(Citedonpage98.)

[168]MuktaPaliwalandUshaAKumar.Neuralnetworksandstatisticaltechniques:Areviewofapplications.Expertsystemswithapplications,36(1):2–17,2009.(Citedonpage97.)

[169]AlexandrosPantelopoulosandNikolaosGBourbakis.Prognosis-awearablehealth-monitoringsystemforpeopleatrisk:Methodol-ogyandmodeling.IEEETransactionsonInformationTechnologyinBiomedicine,14(3):613–621,2010.(Citedonpage95.)

[170]AbhilashaM.Patel,PankajK.Gakare,andA.N.Cheeran.Realtimeecgfeatureextractionandarrhythmiadetectiononamobileplatform.Inter-nationalJournalofComputerApplications,44(23):40–45,April2012.(Citedonpages99,100,and101.)

[171]HanchuanPeng,FuhuiLong,andChrisDing.Featureselectionbasedonmutualinformationcriteriaofmax-dependency,max-relevance,andmin-redundancy.IEEETransactionsonpatternanalysisandmachineintelligence,27(8):1226–1238,2005.(Citedonpage40.)

182 BIBLIOGRAPHY

[172] François Petitjean, Alain Ketterlin, and Pierre Gançarski. A global aver-aging method for dynamic time warping, with applications to clustering.Pattern Recognition, 44(3):678–693, 2011. (Cited on page 116.)

[173] Vili Podgorelec, Peter Kokol, Bruno Stiglic, and Ivan Rozman. Decisiontrees: an overview and their use in medicine. Journal of medical systems,26(5):445–463, 2002. (Cited on page 98.)

[174] François Portet, Ehud Reiter, Albert Gatt, Jim Hunter, Somayajulu Sri-pada, Yvonne Freer, and Cindy Sykes. Automatic generation of tex-tual summaries from neonatal intensive care data. Artificial Intelligence,173(7-8):789–816, 2009. (Cited on pages 31 and 32.)

[175] Gaurav N. Pradhan, Rita Chattopadhyay, and S. Panchanathan. Process-ing body sensor data streams for continuous physiological monitoring.In Proceedings of the international conference on Multimedia informa-tion retrieval, pages 479–486, New York, NY, USA, 2010. ACM. (Citedon page 97.)

[176] Willard Van Orman Quine. Ontological relativity and other essays.Number 1. Columbia University Press, 1969. (Cited on pages 24and 36.)

[177] Alejandro Ramos-Soto, Alberto Jose Bugarin, and Senén Barro. Fuzzysets across the natural language generation pipeline. Progress in artificialintelligence, 5(4):261–276, 2016. (Cited on pages 56 and 83.)

[178] Alejandro Ramos-Soto, Alberto Jose Bugarin, and Senén Barro. On therole of linguistic descriptions of data in the building of natural languagegeneration systems. Fuzzy Sets and Systems, 285:31–51, 2016. (Citedon pages 27 and 28.)

[179] Alejandro Ramos-Soto, Alberto Jose Bugarin, Senén Barro, and FelixDiaz-Hermida. Automatic linguistic descriptions of meteorological dataa soft computing approach for converting open data to open informa-tion. In 8th Iberian Conference on Information Systems and Technolo-gies (CISTI), pages 1–6. IEEE, 2013. (Cited on page 28.)

[180] Alejandro Ramos-Soto, Alberto Jose Bugarin, Senén Barro, and JuanTaboada. Linguistic descriptions for automatic generation of textualshort-term weather forecasts on real prediction data. IEEE Transactionson Fuzzy Systems, 23(1):44–57, 2015. (Cited on page 83.)

[181] Martin Raubal. Formalizing conceptual spaces. In Formal ontology ininformation systems, proceedings of the third international conference(FOIS 2004), volume 114, pages 153–164, 2004. (Cited on pages 24and 35.)

182BIBLIOGRAPHY

[172]FrançoisPetitjean,AlainKetterlin,andPierreGançarski.Aglobalaver-agingmethodfordynamictimewarping,withapplicationstoclustering.PatternRecognition,44(3):678–693,2011.(Citedonpage116.)

[173]ViliPodgorelec,PeterKokol,BrunoStiglic,andIvanRozman.Decisiontrees:anoverviewandtheiruseinmedicine.Journalofmedicalsystems,26(5):445–463,2002.(Citedonpage98.)

[174]FrançoisPortet,EhudReiter,AlbertGatt,JimHunter,SomayajuluSri-pada,YvonneFreer,andCindySykes.Automaticgenerationoftex-tualsummariesfromneonatalintensivecaredata.ArtificialIntelligence,173(7-8):789–816,2009.(Citedonpages31and32.)

[175]GauravN.Pradhan,RitaChattopadhyay,andS.Panchanathan.Process-ingbodysensordatastreamsforcontinuousphysiologicalmonitoring.InProceedingsoftheinternationalconferenceonMultimediainforma-tionretrieval,pages479–486,NewYork,NY,USA,2010.ACM.(Citedonpage97.)

[176]WillardVanOrmanQuine.Ontologicalrelativityandotheressays.Number1.ColumbiaUniversityPress,1969.(Citedonpages24and36.)

[177]AlejandroRamos-Soto,AlbertoJoseBugarin,andSenénBarro.Fuzzysetsacrossthenaturallanguagegenerationpipeline.Progressinartificialintelligence,5(4):261–276,2016.(Citedonpages56and83.)

[178]AlejandroRamos-Soto,AlbertoJoseBugarin,andSenénBarro.Ontheroleoflinguisticdescriptionsofdatainthebuildingofnaturallanguagegenerationsystems.FuzzySetsandSystems,285:31–51,2016.(Citedonpages27and28.)

[179]AlejandroRamos-Soto,AlbertoJoseBugarin,SenénBarro,andFelixDiaz-Hermida.Automaticlinguisticdescriptionsofmeteorologicaldataasoftcomputingapproachforconvertingopendatatoopeninforma-tion.In8thIberianConferenceonInformationSystemsandTechnolo-gies(CISTI),pages1–6.IEEE,2013.(Citedonpage28.)

[180]AlejandroRamos-Soto,AlbertoJoseBugarin,SenénBarro,andJuanTaboada.Linguisticdescriptionsforautomaticgenerationoftextualshort-termweatherforecastsonrealpredictiondata.IEEETransactionsonFuzzySystems,23(1):44–57,2015.(Citedonpage83.)

[181]MartinRaubal.Formalizingconceptualspaces.InFormalontologyininformationsystems,proceedingsofthethirdinternationalconference(FOIS2004),volume114,pages153–164,2004.(Citedonpages24and35.)

182 BIBLIOGRAPHY

[172] François Petitjean, Alain Ketterlin, and Pierre Gançarski. A global aver-aging method for dynamic time warping, with applications to clustering.Pattern Recognition, 44(3):678–693, 2011. (Cited on page 116.)

[173] Vili Podgorelec, Peter Kokol, Bruno Stiglic, and Ivan Rozman. Decisiontrees: an overview and their use in medicine. Journal of medical systems,26(5):445–463, 2002. (Cited on page 98.)

[174] François Portet, Ehud Reiter, Albert Gatt, Jim Hunter, Somayajulu Sri-pada, Yvonne Freer, and Cindy Sykes. Automatic generation of tex-tual summaries from neonatal intensive care data. Artificial Intelligence,173(7-8):789–816, 2009. (Cited on pages 31 and 32.)

[175] Gaurav N. Pradhan, Rita Chattopadhyay, and S. Panchanathan. Process-ing body sensor data streams for continuous physiological monitoring.In Proceedings of the international conference on Multimedia informa-tion retrieval, pages 479–486, New York, NY, USA, 2010. ACM. (Citedon page 97.)

[176] Willard Van Orman Quine. Ontological relativity and other essays.Number 1. Columbia University Press, 1969. (Cited on pages 24and 36.)

[177] Alejandro Ramos-Soto, Alberto Jose Bugarin, and Senén Barro. Fuzzysets across the natural language generation pipeline. Progress in artificialintelligence, 5(4):261–276, 2016. (Cited on pages 56 and 83.)

[178] Alejandro Ramos-Soto, Alberto Jose Bugarin, and Senén Barro. On therole of linguistic descriptions of data in the building of natural languagegeneration systems. Fuzzy Sets and Systems, 285:31–51, 2016. (Citedon pages 27 and 28.)

[179] Alejandro Ramos-Soto, Alberto Jose Bugarin, Senén Barro, and FelixDiaz-Hermida. Automatic linguistic descriptions of meteorological dataa soft computing approach for converting open data to open informa-tion. In 8th Iberian Conference on Information Systems and Technolo-gies (CISTI), pages 1–6. IEEE, 2013. (Cited on page 28.)

[180] Alejandro Ramos-Soto, Alberto Jose Bugarin, Senén Barro, and JuanTaboada. Linguistic descriptions for automatic generation of textualshort-term weather forecasts on real prediction data. IEEE Transactionson Fuzzy Systems, 23(1):44–57, 2015. (Cited on page 83.)

[181] Martin Raubal. Formalizing conceptual spaces. In Formal ontology ininformation systems, proceedings of the third international conference(FOIS 2004), volume 114, pages 153–164, 2004. (Cited on pages 24and 35.)

182BIBLIOGRAPHY

[172]FrançoisPetitjean,AlainKetterlin,andPierreGançarski.Aglobalaver-agingmethodfordynamictimewarping,withapplicationstoclustering.PatternRecognition,44(3):678–693,2011.(Citedonpage116.)

[173]ViliPodgorelec,PeterKokol,BrunoStiglic,andIvanRozman.Decisiontrees:anoverviewandtheiruseinmedicine.Journalofmedicalsystems,26(5):445–463,2002.(Citedonpage98.)

[174]FrançoisPortet,EhudReiter,AlbertGatt,JimHunter,SomayajuluSri-pada,YvonneFreer,andCindySykes.Automaticgenerationoftex-tualsummariesfromneonatalintensivecaredata.ArtificialIntelligence,173(7-8):789–816,2009.(Citedonpages31and32.)

[175]GauravN.Pradhan,RitaChattopadhyay,andS.Panchanathan.Process-ingbodysensordatastreamsforcontinuousphysiologicalmonitoring.InProceedingsoftheinternationalconferenceonMultimediainforma-tionretrieval,pages479–486,NewYork,NY,USA,2010.ACM.(Citedonpage97.)

[176]WillardVanOrmanQuine.Ontologicalrelativityandotheressays.Number1.ColumbiaUniversityPress,1969.(Citedonpages24and36.)

[177]AlejandroRamos-Soto,AlbertoJoseBugarin,andSenénBarro.Fuzzysetsacrossthenaturallanguagegenerationpipeline.Progressinartificialintelligence,5(4):261–276,2016.(Citedonpages56and83.)

[178]AlejandroRamos-Soto,AlbertoJoseBugarin,andSenénBarro.Ontheroleoflinguisticdescriptionsofdatainthebuildingofnaturallanguagegenerationsystems.FuzzySetsandSystems,285:31–51,2016.(Citedonpages27and28.)

[179]AlejandroRamos-Soto,AlbertoJoseBugarin,SenénBarro,andFelixDiaz-Hermida.Automaticlinguisticdescriptionsofmeteorologicaldataasoftcomputingapproachforconvertingopendatatoopeninforma-tion.In8thIberianConferenceonInformationSystemsandTechnolo-gies(CISTI),pages1–6.IEEE,2013.(Citedonpage28.)

[180]AlejandroRamos-Soto,AlbertoJoseBugarin,SenénBarro,andJuanTaboada.Linguisticdescriptionsforautomaticgenerationoftextualshort-termweatherforecastsonrealpredictiondata.IEEETransactionsonFuzzySystems,23(1):44–57,2015.(Citedonpage83.)

[181]MartinRaubal.Formalizingconceptualspaces.InFormalontologyininformationsystems,proceedingsofthethirdinternationalconference(FOIS2004),volume114,pages153–164,2004.(Citedonpages24and35.)

BIBLIOGRAPHY 183

[182] Ehud Reiter. An architecture for data-to-text systems. In Proceedingsof the Eleventh European Workshop on Natural Language Generation,pages 97–104, 2007. (Cited on pages 30, 31, 32, 57, and 65.)

[183] Ehud Reiter and Anja Belz. An investigation into the validity of somemetrics for automatically evaluating natural language generation sys-tems. Computational Linguistics, 35(4):529–558, 2009. (Cited on page80.)

[184] Ehud Reiter and Robert Dale. Building applied natural language genera-tion systems. Natural Language Engineering, 3(01):57–87, 1997. (Citedon pages 29 and 69.)

[185] Ehud Reiter, Robert Dale, and Zhiwei Feng. Building natural languagegeneration systems, volume 33. MIT Press, 2000. (Cited on pages 29,56, 68, 69, 70, and 132.)

[186] Ehud Reiter, Somayajulu Sripada, Jim Hunter, Jin Yu, and Ian Davy.Choosing words in computer-generated weather forecasts. Artificial In-telligence, 167(1):137–169, 2005. (Cited on pages 31 and 32.)

[187] Ehud Reiter, Somayajulu G Sripada, and Roma Robertson. Acquiringcorrect knowledge for natural language generation. Journal of Artifi-cial Intelligence Research, pages 491–516, 2003. (Cited on pages 29and 31.)

[188] John T Rickard. A concept geometry for conceptual spaces. Fuzzy opti-mization and decision making, 5(4):311–329, 2006. (Cited on pages 4,24, 26, 62, and 66.)

[189] John T Rickard, Janet Aisbett, and Greg Gibbon. Reformulation of thetheory of conceptual spaces. Information Sciences, 177(21):4539–4565,2007. (Cited on pages 4, 24, 35, and 56.)

[190] Joanne K Rowling. Harry Potter And The Prisoner Of Azkaban. NewYork: Arthur A. Levine Books, 1999. (Cited on page 2.)

[191] Jalal al-Din Rumi and Arthur John Arberry. Tales from the Masnavi.Curzon paperbacks. Curzon Press, 1993. (Cited on page 1.)

[192] Lucia Sacchi, Riccardo Bellazzi, Cristiana Larizza, Riccardo Porreca, andPaolo Magni. Learning rules with complex temporal patterns in biomed-ical domains. In Conference on Artificial Intelligence in Medicine in Eu-rope, pages 23–32. Springer, 2005. (Cited on page 121.)

[193] Lucia Sacchi, Cristiana Larizza, Carlo Combi, and Riccardo Bellazzi.Data mining with temporal abstractions: learning rules from time series.Data Mining and Knowledge Discovery, 15(2):217–247, 2007. (Citedon page 121.)

BIBLIOGRAPHY183

[182]EhudReiter.Anarchitecturefordata-to-textsystems.InProceedingsoftheEleventhEuropeanWorkshoponNaturalLanguageGeneration,pages97–104,2007.(Citedonpages30,31,32,57,and65.)

[183]EhudReiterandAnjaBelz.Aninvestigationintothevalidityofsomemetricsforautomaticallyevaluatingnaturallanguagegenerationsys-tems.ComputationalLinguistics,35(4):529–558,2009.(Citedonpage80.)

[184]EhudReiterandRobertDale.Buildingappliednaturallanguagegenera-tionsystems.NaturalLanguageEngineering,3(01):57–87,1997.(Citedonpages29and69.)

[185]EhudReiter,RobertDale,andZhiweiFeng.Buildingnaturallanguagegenerationsystems,volume33.MITPress,2000.(Citedonpages29,56,68,69,70,and132.)

[186]EhudReiter,SomayajuluSripada,JimHunter,JinYu,andIanDavy.Choosingwordsincomputer-generatedweatherforecasts.ArtificialIn-telligence,167(1):137–169,2005.(Citedonpages31and32.)

[187]EhudReiter,SomayajuluGSripada,andRomaRobertson.Acquiringcorrectknowledgefornaturallanguagegeneration.JournalofArtifi-cialIntelligenceResearch,pages491–516,2003.(Citedonpages29and31.)

[188]JohnTRickard.Aconceptgeometryforconceptualspaces.Fuzzyopti-mizationanddecisionmaking,5(4):311–329,2006.(Citedonpages4,24,26,62,and66.)

[189]JohnTRickard,JanetAisbett,andGregGibbon.Reformulationofthetheoryofconceptualspaces.InformationSciences,177(21):4539–4565,2007.(Citedonpages4,24,35,and56.)

[190]JoanneKRowling.HarryPotterAndThePrisonerOfAzkaban.NewYork:ArthurA.LevineBooks,1999.(Citedonpage2.)

[191]Jalalal-DinRumiandArthurJohnArberry.TalesfromtheMasnavi.Curzonpaperbacks.CurzonPress,1993.(Citedonpage1.)

[192]LuciaSacchi,RiccardoBellazzi,CristianaLarizza,RiccardoPorreca,andPaoloMagni.Learningruleswithcomplextemporalpatternsinbiomed-icaldomains.InConferenceonArtificialIntelligenceinMedicineinEu-rope,pages23–32.Springer,2005.(Citedonpage121.)

[193]LuciaSacchi,CristianaLarizza,CarloCombi,andRiccardoBellazzi.Dataminingwithtemporalabstractions:learningrulesfromtimeseries.DataMiningandKnowledgeDiscovery,15(2):217–247,2007.(Citedonpage121.)

BIBLIOGRAPHY 183

[182] Ehud Reiter. An architecture for data-to-text systems. In Proceedingsof the Eleventh European Workshop on Natural Language Generation,pages 97–104, 2007. (Cited on pages 30, 31, 32, 57, and 65.)

[183] Ehud Reiter and Anja Belz. An investigation into the validity of somemetrics for automatically evaluating natural language generation sys-tems. Computational Linguistics, 35(4):529–558, 2009. (Cited on page80.)

[184] Ehud Reiter and Robert Dale. Building applied natural language genera-tion systems. Natural Language Engineering, 3(01):57–87, 1997. (Citedon pages 29 and 69.)

[185] Ehud Reiter, Robert Dale, and Zhiwei Feng. Building natural languagegeneration systems, volume 33. MIT Press, 2000. (Cited on pages 29,56, 68, 69, 70, and 132.)

[186] Ehud Reiter, Somayajulu Sripada, Jim Hunter, Jin Yu, and Ian Davy.Choosing words in computer-generated weather forecasts. Artificial In-telligence, 167(1):137–169, 2005. (Cited on pages 31 and 32.)

[187] Ehud Reiter, Somayajulu G Sripada, and Roma Robertson. Acquiringcorrect knowledge for natural language generation. Journal of Artifi-cial Intelligence Research, pages 491–516, 2003. (Cited on pages 29and 31.)

[188] John T Rickard. A concept geometry for conceptual spaces. Fuzzy opti-mization and decision making, 5(4):311–329, 2006. (Cited on pages 4,24, 26, 62, and 66.)

[189] John T Rickard, Janet Aisbett, and Greg Gibbon. Reformulation of thetheory of conceptual spaces. Information Sciences, 177(21):4539–4565,2007. (Cited on pages 4, 24, 35, and 56.)

[190] Joanne K Rowling. Harry Potter And The Prisoner Of Azkaban. NewYork: Arthur A. Levine Books, 1999. (Cited on page 2.)

[191] Jalal al-Din Rumi and Arthur John Arberry. Tales from the Masnavi.Curzon paperbacks. Curzon Press, 1993. (Cited on page 1.)

[192] Lucia Sacchi, Riccardo Bellazzi, Cristiana Larizza, Riccardo Porreca, andPaolo Magni. Learning rules with complex temporal patterns in biomed-ical domains. In Conference on Artificial Intelligence in Medicine in Eu-rope, pages 23–32. Springer, 2005. (Cited on page 121.)

[193] Lucia Sacchi, Cristiana Larizza, Carlo Combi, and Riccardo Bellazzi.Data mining with temporal abstractions: learning rules from time series.Data Mining and Knowledge Discovery, 15(2):217–247, 2007. (Citedon page 121.)

BIBLIOGRAPHY183

[182]EhudReiter.Anarchitecturefordata-to-textsystems.InProceedingsoftheEleventhEuropeanWorkshoponNaturalLanguageGeneration,pages97–104,2007.(Citedonpages30,31,32,57,and65.)

[183]EhudReiterandAnjaBelz.Aninvestigationintothevalidityofsomemetricsforautomaticallyevaluatingnaturallanguagegenerationsys-tems.ComputationalLinguistics,35(4):529–558,2009.(Citedonpage80.)

[184]EhudReiterandRobertDale.Buildingappliednaturallanguagegenera-tionsystems.NaturalLanguageEngineering,3(01):57–87,1997.(Citedonpages29and69.)

[185]EhudReiter,RobertDale,andZhiweiFeng.Buildingnaturallanguagegenerationsystems,volume33.MITPress,2000.(Citedonpages29,56,68,69,70,and132.)

[186]EhudReiter,SomayajuluSripada,JimHunter,JinYu,andIanDavy.Choosingwordsincomputer-generatedweatherforecasts.ArtificialIn-telligence,167(1):137–169,2005.(Citedonpages31and32.)

[187]EhudReiter,SomayajuluGSripada,andRomaRobertson.Acquiringcorrectknowledgefornaturallanguagegeneration.JournalofArtifi-cialIntelligenceResearch,pages491–516,2003.(Citedonpages29and31.)

[188]JohnTRickard.Aconceptgeometryforconceptualspaces.Fuzzyopti-mizationanddecisionmaking,5(4):311–329,2006.(Citedonpages4,24,26,62,and66.)

[189]JohnTRickard,JanetAisbett,andGregGibbon.Reformulationofthetheoryofconceptualspaces.InformationSciences,177(21):4539–4565,2007.(Citedonpages4,24,35,and56.)

[190]JoanneKRowling.HarryPotterAndThePrisonerOfAzkaban.NewYork:ArthurA.LevineBooks,1999.(Citedonpage2.)

[191]Jalalal-DinRumiandArthurJohnArberry.TalesfromtheMasnavi.Curzonpaperbacks.CurzonPress,1993.(Citedonpage1.)

[192]LuciaSacchi,RiccardoBellazzi,CristianaLarizza,RiccardoPorreca,andPaoloMagni.Learningruleswithcomplextemporalpatternsinbiomed-icaldomains.InConferenceonArtificialIntelligenceinMedicineinEu-rope,pages23–32.Springer,2005.(Citedonpage121.)

[193]LuciaSacchi,CristianaLarizza,CarloCombi,andRiccardoBellazzi.Dataminingwithtemporalabstractions:learningrulesfromtimeseries.DataMiningandKnowledgeDiscovery,15(2):217–247,2007.(Citedonpage121.)

184 BIBLIOGRAPHY

[194] Ehssan Sakhaee. Elephant in the dark. Available online: https://www.youtube.com/watch?v=iQFjmTEYxs8. (Accessed March 3, 2018).(Cited on pages xi and 163.)

[195] Osman Salem, Yaning Liu, and Ahmed Mehaoua. A lightweight anomalydetection framework for medical wireless sensor networks. In Proceed-ings of the IEEE Wireless Communications and Networking Conference,pages 4358–4363. IEEE, 2013. (Cited on pages 98 and 99.)

[196] Erik Schaffernicht, Robert Kaltenhaeuser, Saurabh Shekhar Verma, andHorst-Michael Gross. On estimating mutual information for featureselection. In International Conference on Artificial Neural Networks,pages 362–367. Springer, 2010. (Cited on page 40.)

[197] Tim Schlüter and Stefan Conrad. About the analysis of time serieswith temporal association rule mining. In IEEE Symposium on Com-putational Intelligence and Data Mining (CIDM), pages 325–332. IEEE,2011. (Cited on page 120.)

[198] Steven Schockaert and Henri Prade. Interpolative and extrapolative rea-soning in propositional theories using qualitative knowledge about con-ceptual spaces. Artificial Intelligence, 202:86–131, 2013. (Cited on page25.)

[199] Angela Schwering and Martin Raubal. Spatial relations for semantic sim-ilarity measurement. In International Conference on Conceptual Mod-eling, pages 259–269. Springer, 2005. (Cited on page 25.)

[200] Christopher G Scully, Jinseok Lee, Joseph Meyer, Alexander M Gorbach,Domhnull Granquist-Fraser, Yitzhak Mendelson, and Ki H Chon. Phys-iological parameter monitoring from optical recordings with a mobilephone. IEEE Transactions on Biomedical Engineering, 59(2):303–306,2012. (Cited on pages 99 and 101.)

[201] Yuval Shahar. A framework for knowledge-based temporal abstraction.Artificial intelligence, 90(1-2):79–133, 1997. (Cited on page 122.)

[202] Roger N Shepard et al. Toward a universal law of generalization forpsychological science. Science, 237(4820):1317–1323, 1987. (Cited onpage 62.)

[203] Robert H Shumway and David S Stoffer. Time series analysis and its ap-plications: with R examples. Springer Science & Business Media, 2010.(Cited on page 141.)

[204] Carina Silberer, Vittorio Ferrari, and Mirella Lapata. Models of semanticrepresentation with visual attributes. In Proceedings of the 51st Annual

184BIBLIOGRAPHY

[194]EhssanSakhaee.Elephantinthedark.Availableonline:https://www.youtube.com/watch?v=iQFjmTEYxs8.(AccessedMarch3,2018).(Citedonpagesxiand163.)

[195]OsmanSalem,YaningLiu,andAhmedMehaoua.Alightweightanomalydetectionframeworkformedicalwirelesssensornetworks.InProceed-ingsoftheIEEEWirelessCommunicationsandNetworkingConference,pages4358–4363.IEEE,2013.(Citedonpages98and99.)

[196]ErikSchaffernicht,RobertKaltenhaeuser,SaurabhShekharVerma,andHorst-MichaelGross.Onestimatingmutualinformationforfeatureselection.InInternationalConferenceonArtificialNeuralNetworks,pages362–367.Springer,2010.(Citedonpage40.)

[197]TimSchlüterandStefanConrad.Abouttheanalysisoftimeserieswithtemporalassociationrulemining.InIEEESymposiumonCom-putationalIntelligenceandDataMining(CIDM),pages325–332.IEEE,2011.(Citedonpage120.)

[198]StevenSchockaertandHenriPrade.Interpolativeandextrapolativerea-soninginpropositionaltheoriesusingqualitativeknowledgeaboutcon-ceptualspaces.ArtificialIntelligence,202:86–131,2013.(Citedonpage25.)

[199]AngelaSchweringandMartinRaubal.Spatialrelationsforsemanticsim-ilaritymeasurement.InInternationalConferenceonConceptualMod-eling,pages259–269.Springer,2005.(Citedonpage25.)

[200]ChristopherGScully,JinseokLee,JosephMeyer,AlexanderMGorbach,DomhnullGranquist-Fraser,YitzhakMendelson,andKiHChon.Phys-iologicalparametermonitoringfromopticalrecordingswithamobilephone.IEEETransactionsonBiomedicalEngineering,59(2):303–306,2012.(Citedonpages99and101.)

[201]YuvalShahar.Aframeworkforknowledge-basedtemporalabstraction.Artificialintelligence,90(1-2):79–133,1997.(Citedonpage122.)

[202]RogerNShepardetal.Towardauniversallawofgeneralizationforpsychologicalscience.Science,237(4820):1317–1323,1987.(Citedonpage62.)

[203]RobertHShumwayandDavidSStoffer.Timeseriesanalysisanditsap-plications:withRexamples.SpringerScience&BusinessMedia,2010.(Citedonpage141.)

[204]CarinaSilberer,VittorioFerrari,andMirellaLapata.Modelsofsemanticrepresentationwithvisualattributes.InProceedingsofthe51stAnnual

184 BIBLIOGRAPHY

[194] Ehssan Sakhaee. Elephant in the dark. Available online: https://www.youtube.com/watch?v=iQFjmTEYxs8. (Accessed March 3, 2018).(Cited on pages xi and 163.)

[195] Osman Salem, Yaning Liu, and Ahmed Mehaoua. A lightweight anomalydetection framework for medical wireless sensor networks. In Proceed-ings of the IEEE Wireless Communications and Networking Conference,pages 4358–4363. IEEE, 2013. (Cited on pages 98 and 99.)

[196] Erik Schaffernicht, Robert Kaltenhaeuser, Saurabh Shekhar Verma, andHorst-Michael Gross. On estimating mutual information for featureselection. In International Conference on Artificial Neural Networks,pages 362–367. Springer, 2010. (Cited on page 40.)

[197] Tim Schlüter and Stefan Conrad. About the analysis of time serieswith temporal association rule mining. In IEEE Symposium on Com-putational Intelligence and Data Mining (CIDM), pages 325–332. IEEE,2011. (Cited on page 120.)

[198] Steven Schockaert and Henri Prade. Interpolative and extrapolative rea-soning in propositional theories using qualitative knowledge about con-ceptual spaces. Artificial Intelligence, 202:86–131, 2013. (Cited on page25.)

[199] Angela Schwering and Martin Raubal. Spatial relations for semantic sim-ilarity measurement. In International Conference on Conceptual Mod-eling, pages 259–269. Springer, 2005. (Cited on page 25.)

[200] Christopher G Scully, Jinseok Lee, Joseph Meyer, Alexander M Gorbach,Domhnull Granquist-Fraser, Yitzhak Mendelson, and Ki H Chon. Phys-iological parameter monitoring from optical recordings with a mobilephone. IEEE Transactions on Biomedical Engineering, 59(2):303–306,2012. (Cited on pages 99 and 101.)

[201] Yuval Shahar. A framework for knowledge-based temporal abstraction.Artificial intelligence, 90(1-2):79–133, 1997. (Cited on page 122.)

[202] Roger N Shepard et al. Toward a universal law of generalization forpsychological science. Science, 237(4820):1317–1323, 1987. (Cited onpage 62.)

[203] Robert H Shumway and David S Stoffer. Time series analysis and its ap-plications: with R examples. Springer Science & Business Media, 2010.(Cited on page 141.)

[204] Carina Silberer, Vittorio Ferrari, and Mirella Lapata. Models of semanticrepresentation with visual attributes. In Proceedings of the 51st Annual

184BIBLIOGRAPHY

[194]EhssanSakhaee.Elephantinthedark.Availableonline:https://www.youtube.com/watch?v=iQFjmTEYxs8.(AccessedMarch3,2018).(Citedonpagesxiand163.)

[195]OsmanSalem,YaningLiu,andAhmedMehaoua.Alightweightanomalydetectionframeworkformedicalwirelesssensornetworks.InProceed-ingsoftheIEEEWirelessCommunicationsandNetworkingConference,pages4358–4363.IEEE,2013.(Citedonpages98and99.)

[196]ErikSchaffernicht,RobertKaltenhaeuser,SaurabhShekharVerma,andHorst-MichaelGross.Onestimatingmutualinformationforfeatureselection.InInternationalConferenceonArtificialNeuralNetworks,pages362–367.Springer,2010.(Citedonpage40.)

[197]TimSchlüterandStefanConrad.Abouttheanalysisoftimeserieswithtemporalassociationrulemining.InIEEESymposiumonCom-putationalIntelligenceandDataMining(CIDM),pages325–332.IEEE,2011.(Citedonpage120.)

[198]StevenSchockaertandHenriPrade.Interpolativeandextrapolativerea-soninginpropositionaltheoriesusingqualitativeknowledgeaboutcon-ceptualspaces.ArtificialIntelligence,202:86–131,2013.(Citedonpage25.)

[199]AngelaSchweringandMartinRaubal.Spatialrelationsforsemanticsim-ilaritymeasurement.InInternationalConferenceonConceptualMod-eling,pages259–269.Springer,2005.(Citedonpage25.)

[200]ChristopherGScully,JinseokLee,JosephMeyer,AlexanderMGorbach,DomhnullGranquist-Fraser,YitzhakMendelson,andKiHChon.Phys-iologicalparametermonitoringfromopticalrecordingswithamobilephone.IEEETransactionsonBiomedicalEngineering,59(2):303–306,2012.(Citedonpages99and101.)

[201]YuvalShahar.Aframeworkforknowledge-basedtemporalabstraction.Artificialintelligence,90(1-2):79–133,1997.(Citedonpage122.)

[202]RogerNShepardetal.Towardauniversallawofgeneralizationforpsychologicalscience.Science,237(4820):1317–1323,1987.(Citedonpage62.)

[203]RobertHShumwayandDavidSStoffer.Timeseriesanalysisanditsap-plications:withRexamples.SpringerScience&BusinessMedia,2010.(Citedonpage141.)

[204]CarinaSilberer,VittorioFerrari,andMirellaLapata.Modelsofsemanticrepresentationwithvisualattributes.InProceedingsofthe51stAnnual

BIBLIOGRAPHY 185

Meeting of the Association for Computational Linguistics (ACL), vol-ume 1, pages 572–582, 2013. (Cited on page 160.)

[205] Fábio Silva, Teresa Olivares, Fernando Royo, MA Vergara, and CesarAnalide. Experimental study of the stress level at the workplace us-ing an smart testbed of wireless sensor networks and ambient intelli-gence techniques. In International Work-Conference on the InterplayBetween Natural and Artificial Computation, pages 200–209. Springer,2013. (Cited on pages 94 and 99.)

[206] Pedro FB Silva, André RS Marçal, and Rubim M Almeida da Silva. Eval-uation of features for leaf discrimination. In Image Analysis and Recog-nition, pages 197–204. 2013. (Cited on pages 36, 73, 74, and 75.)

[207] Craig Silverstein, Sergey Brin, and Rajeev Motwani. Beyond market bas-kets: Generalizing association rules to dependence rules. Data miningand knowledge discovery, 2(1):39–68, 1998. (Cited on page 133.)

[208] Rajiv Ranjan Singh, Sailesh Conjeti, and Rahul Banerjee. An approachfor real-time stress-trend detection using physiological signals in wear-able computing systems for automotive drivers. In 14th InternationalIEEE Conference on Intelligent Transportation Systems (ITSC), pages1477–1482. IEEE, 2011. (Cited on pages 96, 97, 99, and 100.)

[209] Linda B Smith. From global similarities to kinds of similarities: the con-struction of dimensions in development. In Similarity and analogicalreasoning, pages 146–178. Cambridge University Press, 1989. (Citedon page 25.)

[210] Sweta Sneha and Upkar Varshney. Enabling ubiquitous patient monitor-ing: Model, decision protocols, opportunities and challenges. DecisionSupport Systems, 46(3):606–619, 2009. (Cited on page 95.)

[211] Mohammad S Sorower. A literature survey on algorithms for multi-labellearning. Oregon State University, Corvallis, 2010. (Cited on page 83.)

[212] Daby Sow, Deepak S. Turaga, and Michael Schmidt. Mining of sensordata in healthcare: A survey. In Charu C. Aggarwal, editor, Managingand Mining Sensor Data, pages 459–504. Springer US, 2013. (Cited onpages 92, 93, 96, and 97.)

[213] Daby Sow, Deepak S Turaga, and Michael Schmidt. Mining of sensordata in healthcare: a survey. In Managing and mining sensor data, pages459–504. Springer, 2013. (Cited on page 105.)

[214] Thomas L Spalding and Brian H Ross. Concept learning and featureinterpretation. Memory & cognition, 28(3):439–451, 2000. (Cited onpage 50.)

BIBLIOGRAPHY185

MeetingoftheAssociationforComputationalLinguistics(ACL),vol-ume1,pages572–582,2013.(Citedonpage160.)

[205]FábioSilva,TeresaOlivares,FernandoRoyo,MAVergara,andCesarAnalide.Experimentalstudyofthestresslevelattheworkplaceus-ingansmarttestbedofwirelesssensornetworksandambientintelli-gencetechniques.InInternationalWork-ConferenceontheInterplayBetweenNaturalandArtificialComputation,pages200–209.Springer,2013.(Citedonpages94and99.)

[206]PedroFBSilva,AndréRSMarçal,andRubimMAlmeidadaSilva.Eval-uationoffeaturesforleafdiscrimination.InImageAnalysisandRecog-nition,pages197–204.2013.(Citedonpages36,73,74,and75.)

[207]CraigSilverstein,SergeyBrin,andRajeevMotwani.Beyondmarketbas-kets:Generalizingassociationrulestodependencerules.Dataminingandknowledgediscovery,2(1):39–68,1998.(Citedonpage133.)

[208]RajivRanjanSingh,SaileshConjeti,andRahulBanerjee.Anapproachforreal-timestress-trenddetectionusingphysiologicalsignalsinwear-ablecomputingsystemsforautomotivedrivers.In14thInternationalIEEEConferenceonIntelligentTransportationSystems(ITSC),pages1477–1482.IEEE,2011.(Citedonpages96,97,99,and100.)

[209]LindaBSmith.Fromglobalsimilaritiestokindsofsimilarities:thecon-structionofdimensionsindevelopment.InSimilarityandanalogicalreasoning,pages146–178.CambridgeUniversityPress,1989.(Citedonpage25.)

[210]SwetaSnehaandUpkarVarshney.Enablingubiquitouspatientmonitor-ing:Model,decisionprotocols,opportunitiesandchallenges.DecisionSupportSystems,46(3):606–619,2009.(Citedonpage95.)

[211]MohammadSSorower.Aliteraturesurveyonalgorithmsformulti-labellearning.OregonStateUniversity,Corvallis,2010.(Citedonpage83.)

[212]DabySow,DeepakS.Turaga,andMichaelSchmidt.Miningofsensordatainhealthcare:Asurvey.InCharuC.Aggarwal,editor,ManagingandMiningSensorData,pages459–504.SpringerUS,2013.(Citedonpages92,93,96,and97.)

[213]DabySow,DeepakSTuraga,andMichaelSchmidt.Miningofsensordatainhealthcare:asurvey.InManagingandminingsensordata,pages459–504.Springer,2013.(Citedonpage105.)

[214]ThomasLSpaldingandBrianHRoss.Conceptlearningandfeatureinterpretation.Memory&cognition,28(3):439–451,2000.(Citedonpage50.)

BIBLIOGRAPHY 185

Meeting of the Association for Computational Linguistics (ACL), vol-ume 1, pages 572–582, 2013. (Cited on page 160.)

[205] Fábio Silva, Teresa Olivares, Fernando Royo, MA Vergara, and CesarAnalide. Experimental study of the stress level at the workplace us-ing an smart testbed of wireless sensor networks and ambient intelli-gence techniques. In International Work-Conference on the InterplayBetween Natural and Artificial Computation, pages 200–209. Springer,2013. (Cited on pages 94 and 99.)

[206] Pedro FB Silva, André RS Marçal, and Rubim M Almeida da Silva. Eval-uation of features for leaf discrimination. In Image Analysis and Recog-nition, pages 197–204. 2013. (Cited on pages 36, 73, 74, and 75.)

[207] Craig Silverstein, Sergey Brin, and Rajeev Motwani. Beyond market bas-kets: Generalizing association rules to dependence rules. Data miningand knowledge discovery, 2(1):39–68, 1998. (Cited on page 133.)

[208] Rajiv Ranjan Singh, Sailesh Conjeti, and Rahul Banerjee. An approachfor real-time stress-trend detection using physiological signals in wear-able computing systems for automotive drivers. In 14th InternationalIEEE Conference on Intelligent Transportation Systems (ITSC), pages1477–1482. IEEE, 2011. (Cited on pages 96, 97, 99, and 100.)

[209] Linda B Smith. From global similarities to kinds of similarities: the con-struction of dimensions in development. In Similarity and analogicalreasoning, pages 146–178. Cambridge University Press, 1989. (Citedon page 25.)

[210] Sweta Sneha and Upkar Varshney. Enabling ubiquitous patient monitor-ing: Model, decision protocols, opportunities and challenges. DecisionSupport Systems, 46(3):606–619, 2009. (Cited on page 95.)

[211] Mohammad S Sorower. A literature survey on algorithms for multi-labellearning. Oregon State University, Corvallis, 2010. (Cited on page 83.)

[212] Daby Sow, Deepak S. Turaga, and Michael Schmidt. Mining of sensordata in healthcare: A survey. In Charu C. Aggarwal, editor, Managingand Mining Sensor Data, pages 459–504. Springer US, 2013. (Cited onpages 92, 93, 96, and 97.)

[213] Daby Sow, Deepak S Turaga, and Michael Schmidt. Mining of sensordata in healthcare: a survey. In Managing and mining sensor data, pages459–504. Springer, 2013. (Cited on page 105.)

[214] Thomas L Spalding and Brian H Ross. Concept learning and featureinterpretation. Memory & cognition, 28(3):439–451, 2000. (Cited onpage 50.)

BIBLIOGRAPHY185

MeetingoftheAssociationforComputationalLinguistics(ACL),vol-ume1,pages572–582,2013.(Citedonpage160.)

[205]FábioSilva,TeresaOlivares,FernandoRoyo,MAVergara,andCesarAnalide.Experimentalstudyofthestresslevelattheworkplaceus-ingansmarttestbedofwirelesssensornetworksandambientintelli-gencetechniques.InInternationalWork-ConferenceontheInterplayBetweenNaturalandArtificialComputation,pages200–209.Springer,2013.(Citedonpages94and99.)

[206]PedroFBSilva,AndréRSMarçal,andRubimMAlmeidadaSilva.Eval-uationoffeaturesforleafdiscrimination.InImageAnalysisandRecog-nition,pages197–204.2013.(Citedonpages36,73,74,and75.)

[207]CraigSilverstein,SergeyBrin,andRajeevMotwani.Beyondmarketbas-kets:Generalizingassociationrulestodependencerules.Dataminingandknowledgediscovery,2(1):39–68,1998.(Citedonpage133.)

[208]RajivRanjanSingh,SaileshConjeti,andRahulBanerjee.Anapproachforreal-timestress-trenddetectionusingphysiologicalsignalsinwear-ablecomputingsystemsforautomotivedrivers.In14thInternationalIEEEConferenceonIntelligentTransportationSystems(ITSC),pages1477–1482.IEEE,2011.(Citedonpages96,97,99,and100.)

[209]LindaBSmith.Fromglobalsimilaritiestokindsofsimilarities:thecon-structionofdimensionsindevelopment.InSimilarityandanalogicalreasoning,pages146–178.CambridgeUniversityPress,1989.(Citedonpage25.)

[210]SwetaSnehaandUpkarVarshney.Enablingubiquitouspatientmonitor-ing:Model,decisionprotocols,opportunitiesandchallenges.DecisionSupportSystems,46(3):606–619,2009.(Citedonpage95.)

[211]MohammadSSorower.Aliteraturesurveyonalgorithmsformulti-labellearning.OregonStateUniversity,Corvallis,2010.(Citedonpage83.)

[212]DabySow,DeepakS.Turaga,andMichaelSchmidt.Miningofsensordatainhealthcare:Asurvey.InCharuC.Aggarwal,editor,ManagingandMiningSensorData,pages459–504.SpringerUS,2013.(Citedonpages92,93,96,and97.)

[213]DabySow,DeepakSTuraga,andMichaelSchmidt.Miningofsensordatainhealthcare:asurvey.InManagingandminingsensordata,pages459–504.Springer,2013.(Citedonpage105.)

[214]ThomasLSpaldingandBrianHRoss.Conceptlearningandfeatureinterpretation.Memory&cognition,28(3):439–451,2000.(Citedonpage50.)

186 BIBLIOGRAPHY

[215] Somayajulu Sripada, Ehud Reiter, and Ian Davy. Sumtime-mousam:Configurable marine weather forecast generator. Expert Update, 6(3):4–10, 2003. (Cited on page 31.)

[216] Michael Stacey and Carolyn McGregor. Temporal abstraction in intelli-gent clinical data analysis: A survey. Artificial intelligence in medicine,39(1):1–24, 2007. (Cited on page 93.)

[217] Yu Su, Moray Allan, and Frédéric Jurie. Improving object classificationusing semantic attributes. In Proceedings of the British Machine VisionConference, pages 1–10, 2010. (Cited on page 159.)

[218] Feng-Tso Sun, Cynthia Kuo, Heng-Tze Cheng, Senaka Buthpitiya, Pa-tricia Collins, and Martin Griss. Activity-aware mental stress detectionusing physiological sensors. In Martin Gris and Guang Yang, editors,Mobile Computing, Applications, and Services, volume 76, pages 211–230. Springer Berlin Heidelberg, 2012. (Cited on pages 94, 97, 98,and 101.)

[219] Ron Sun. Artificial intelligence: Connectionist and symbolic approaches.International Encyclopedia of the Social and Behavioral Sciences, Ox-ford: Pergamon/Elsevier, pages 783–789, 2001. (Cited on pages 3and 26.)

[220] Pang-Ning Tan et al. Introduction to data mining. Pearson EducationIndia, 2006. (Cited on page 120.)

[221] Pang-Ning Tan, Vipin Kumar, and Jaideep Srivastava. Selecting theright objective measure for association analysis. Information Systems,29(4):293–313, 2004. (Cited on page 126.)

[222] Bhaskar Thakker and Anoop Lal Vyas. Support vector machine for ab-normal pulse classification. International Journal of Computer Applica-tions, 22(7):13–19, 2011. (Cited on pages 93, 97, 98, 100, and 101.)

[223] Kari Torkkola. Information-theoretic methods. In Feature Extraction,pages 167–185. Springer, 2008. (Cited on page 40.)

[224] Gracian Trivino and Michio Sugeno. Towards linguistic descriptionsof phenomena. International Journal of Approximate Reasoning,54(1):22–34, 2013. (Cited on page 28.)

[225] Gabriella Vigliocco, Lotte Meteyard, Mark Andrews, and StavroulaKousta. Toward a theory of semantic representation. Language andCognition, 1(2):219–247, 2009. (Cited on page 20.)

186BIBLIOGRAPHY

[215]SomayajuluSripada,EhudReiter,andIanDavy.Sumtime-mousam:Configurablemarineweatherforecastgenerator.ExpertUpdate,6(3):4–10,2003.(Citedonpage31.)

[216]MichaelStaceyandCarolynMcGregor.Temporalabstractioninintelli-gentclinicaldataanalysis:Asurvey.Artificialintelligenceinmedicine,39(1):1–24,2007.(Citedonpage93.)

[217]YuSu,MorayAllan,andFrédéricJurie.Improvingobjectclassificationusingsemanticattributes.InProceedingsoftheBritishMachineVisionConference,pages1–10,2010.(Citedonpage159.)

[218]Feng-TsoSun,CynthiaKuo,Heng-TzeCheng,SenakaButhpitiya,Pa-triciaCollins,andMartinGriss.Activity-awarementalstressdetectionusingphysiologicalsensors.InMartinGrisandGuangYang,editors,MobileComputing,Applications,andServices,volume76,pages211–230.SpringerBerlinHeidelberg,2012.(Citedonpages94,97,98,and101.)

[219]RonSun.Artificialintelligence:Connectionistandsymbolicapproaches.InternationalEncyclopediaoftheSocialandBehavioralSciences,Ox-ford:Pergamon/Elsevier,pages783–789,2001.(Citedonpages3and26.)

[220]Pang-NingTanetal.Introductiontodatamining.PearsonEducationIndia,2006.(Citedonpage120.)

[221]Pang-NingTan,VipinKumar,andJaideepSrivastava.Selectingtherightobjectivemeasureforassociationanalysis.InformationSystems,29(4):293–313,2004.(Citedonpage126.)

[222]BhaskarThakkerandAnoopLalVyas.Supportvectormachineforab-normalpulseclassification.InternationalJournalofComputerApplica-tions,22(7):13–19,2011.(Citedonpages93,97,98,100,and101.)

[223]KariTorkkola.Information-theoreticmethods.InFeatureExtraction,pages167–185.Springer,2008.(Citedonpage40.)

[224]GracianTrivinoandMichioSugeno.Towardslinguisticdescriptionsofphenomena.InternationalJournalofApproximateReasoning,54(1):22–34,2013.(Citedonpage28.)

[225]GabriellaVigliocco,LotteMeteyard,MarkAndrews,andStavroulaKousta.Towardatheoryofsemanticrepresentation.LanguageandCognition,1(2):219–247,2009.(Citedonpage20.)

186 BIBLIOGRAPHY

[215] Somayajulu Sripada, Ehud Reiter, and Ian Davy. Sumtime-mousam:Configurable marine weather forecast generator. Expert Update, 6(3):4–10, 2003. (Cited on page 31.)

[216] Michael Stacey and Carolyn McGregor. Temporal abstraction in intelli-gent clinical data analysis: A survey. Artificial intelligence in medicine,39(1):1–24, 2007. (Cited on page 93.)

[217] Yu Su, Moray Allan, and Frédéric Jurie. Improving object classificationusing semantic attributes. In Proceedings of the British Machine VisionConference, pages 1–10, 2010. (Cited on page 159.)

[218] Feng-Tso Sun, Cynthia Kuo, Heng-Tze Cheng, Senaka Buthpitiya, Pa-tricia Collins, and Martin Griss. Activity-aware mental stress detectionusing physiological sensors. In Martin Gris and Guang Yang, editors,Mobile Computing, Applications, and Services, volume 76, pages 211–230. Springer Berlin Heidelberg, 2012. (Cited on pages 94, 97, 98,and 101.)

[219] Ron Sun. Artificial intelligence: Connectionist and symbolic approaches.International Encyclopedia of the Social and Behavioral Sciences, Ox-ford: Pergamon/Elsevier, pages 783–789, 2001. (Cited on pages 3and 26.)

[220] Pang-Ning Tan et al. Introduction to data mining. Pearson EducationIndia, 2006. (Cited on page 120.)

[221] Pang-Ning Tan, Vipin Kumar, and Jaideep Srivastava. Selecting theright objective measure for association analysis. Information Systems,29(4):293–313, 2004. (Cited on page 126.)

[222] Bhaskar Thakker and Anoop Lal Vyas. Support vector machine for ab-normal pulse classification. International Journal of Computer Applica-tions, 22(7):13–19, 2011. (Cited on pages 93, 97, 98, 100, and 101.)

[223] Kari Torkkola. Information-theoretic methods. In Feature Extraction,pages 167–185. Springer, 2008. (Cited on page 40.)

[224] Gracian Trivino and Michio Sugeno. Towards linguistic descriptionsof phenomena. International Journal of Approximate Reasoning,54(1):22–34, 2013. (Cited on page 28.)

[225] Gabriella Vigliocco, Lotte Meteyard, Mark Andrews, and StavroulaKousta. Toward a theory of semantic representation. Language andCognition, 1(2):219–247, 2009. (Cited on page 20.)

186BIBLIOGRAPHY

[215]SomayajuluSripada,EhudReiter,andIanDavy.Sumtime-mousam:Configurablemarineweatherforecastgenerator.ExpertUpdate,6(3):4–10,2003.(Citedonpage31.)

[216]MichaelStaceyandCarolynMcGregor.Temporalabstractioninintelli-gentclinicaldataanalysis:Asurvey.Artificialintelligenceinmedicine,39(1):1–24,2007.(Citedonpage93.)

[217]YuSu,MorayAllan,andFrédéricJurie.Improvingobjectclassificationusingsemanticattributes.InProceedingsoftheBritishMachineVisionConference,pages1–10,2010.(Citedonpage159.)

[218]Feng-TsoSun,CynthiaKuo,Heng-TzeCheng,SenakaButhpitiya,Pa-triciaCollins,andMartinGriss.Activity-awarementalstressdetectionusingphysiologicalsensors.InMartinGrisandGuangYang,editors,MobileComputing,Applications,andServices,volume76,pages211–230.SpringerBerlinHeidelberg,2012.(Citedonpages94,97,98,and101.)

[219]RonSun.Artificialintelligence:Connectionistandsymbolicapproaches.InternationalEncyclopediaoftheSocialandBehavioralSciences,Ox-ford:Pergamon/Elsevier,pages783–789,2001.(Citedonpages3and26.)

[220]Pang-NingTanetal.Introductiontodatamining.PearsonEducationIndia,2006.(Citedonpage120.)

[221]Pang-NingTan,VipinKumar,andJaideepSrivastava.Selectingtherightobjectivemeasureforassociationanalysis.InformationSystems,29(4):293–313,2004.(Citedonpage126.)

[222]BhaskarThakkerandAnoopLalVyas.Supportvectormachineforab-normalpulseclassification.InternationalJournalofComputerApplica-tions,22(7):13–19,2011.(Citedonpages93,97,98,100,and101.)

[223]KariTorkkola.Information-theoreticmethods.InFeatureExtraction,pages167–185.Springer,2008.(Citedonpage40.)

[224]GracianTrivinoandMichioSugeno.Towardslinguisticdescriptionsofphenomena.InternationalJournalofApproximateReasoning,54(1):22–34,2013.(Citedonpage28.)

[225]GabriellaVigliocco,LotteMeteyard,MarkAndrews,andStavroulaKousta.Towardatheoryofsemanticrepresentation.LanguageandCognition,1(2):219–247,2009.(Citedonpage20.)

BIBLIOGRAPHY 187

[226] Thi Hong Nhan Vu, Namkyu Park, Yang Koo Lee, Yongmi Lee,Jong Yun Lee, and Keun Ho Ryu. Online discovery of heart rate vari-ability patterns in mobile healthcare services. Journal of Systems andSoftware, 83(10):1930–1940, 2010. (Cited on pages 95 and 99.)

[227] Wei Wang, Honggang Wang, Michael Hempel, Dongming Peng, HamidSharif, and Hsiao-Hwa Chen. Secure stochastic ecg signals based ongaussian mixture model for e-healthcare systems. IEEE Systems Journal,5(4):564–573, 2011. (Cited on pages 95, 98, and 99.)

[228] Andrew R Webb. Statistical pattern recognition. John Wiley & Sons,2003. (Cited on page 40.)

[229] Eric W. Weisstein. Totally ordered set. From MathWorld–A WolframWeb Resource, 2016. (Cited on page 66.)

[230] Ian H Witten and Eibe Frank. Data Mining: Practical machine learningtools and techniques. Morgan Kaufmann, 2005. (Cited on pages 25and 39.)

[231] Stephen Gang Wu, Forrest Sheng Bao, Eric You Xu, Yu-Xuan Wang,Yi-Fan Chang, and Qiao-Liang Xiang. A leaf recognition algorithm forplant classification using probabilistic neural network. In IEEE Inter-national Symposium on Signal Processing and Information Technology,pages 11–16. IEEE, 2007. (Cited on page 74.)

[232] Baile Xie and Hlaing Minn. Real-time sleep apnea detection by clas-sifier combination. IEEE Transactions on Information Technology inBiomedicine, 16(3):469–477, 2012. (Cited on pages 95 and 97.)

[233] Ronald R Yager. A new approach to the summarization of data. Infor-mation Sciences, 28(1):69–86, 1982. (Cited on page 27.)

[234] Jinn-Yi Yeh, Tai-Hsi Wu, and Chuan-Wei Tsao. Using data mining tech-niques to predict hospitalization of hemodialysis patients. Decision Sup-port Systems, 50(2):439–448, 2011. (Cited on pages 94 and 98.)

[235] Illhoi Yoo, Patricia Alafaireet, Miroslav Marinov, Keila Pena-Hernandez,Rajitha Gopidi, Jia-Fu Chang, and Lei Hua. Data mining in healthcareand biomedicine: a survey of the literature. Journal of medical systems,36(4):2431–2448, 2012. (Cited on pages 92, 94, and 105.)

[236] Jin Yu, Jim Hunter, Ehud Reiter, and Somayajulu Sripada. Recognisingvisual patterns to communicate gas turbine time-series data. In Appli-cations and Innovations in Intelligent Systems, pages 105–118. Springer,2003. (Cited on page 140.)

BIBLIOGRAPHY187

[226]ThiHongNhanVu,NamkyuPark,YangKooLee,YongmiLee,JongYunLee,andKeunHoRyu.Onlinediscoveryofheartratevari-abilitypatternsinmobilehealthcareservices.JournalofSystemsandSoftware,83(10):1930–1940,2010.(Citedonpages95and99.)

[227]WeiWang,HonggangWang,MichaelHempel,DongmingPeng,HamidSharif,andHsiao-HwaChen.Securestochasticecgsignalsbasedongaussianmixturemodelfore-healthcaresystems.IEEESystemsJournal,5(4):564–573,2011.(Citedonpages95,98,and99.)

[228]AndrewRWebb.Statisticalpatternrecognition.JohnWiley&Sons,2003.(Citedonpage40.)

[229]EricW.Weisstein.Totallyorderedset.FromMathWorld–AWolframWebResource,2016.(Citedonpage66.)

[230]IanHWittenandEibeFrank.DataMining:Practicalmachinelearningtoolsandtechniques.MorganKaufmann,2005.(Citedonpages25and39.)

[231]StephenGangWu,ForrestShengBao,EricYouXu,Yu-XuanWang,Yi-FanChang,andQiao-LiangXiang.Aleafrecognitionalgorithmforplantclassificationusingprobabilisticneuralnetwork.InIEEEInter-nationalSymposiumonSignalProcessingandInformationTechnology,pages11–16.IEEE,2007.(Citedonpage74.)

[232]BaileXieandHlaingMinn.Real-timesleepapneadetectionbyclas-sifiercombination.IEEETransactionsonInformationTechnologyinBiomedicine,16(3):469–477,2012.(Citedonpages95and97.)

[233]RonaldRYager.Anewapproachtothesummarizationofdata.Infor-mationSciences,28(1):69–86,1982.(Citedonpage27.)

[234]Jinn-YiYeh,Tai-HsiWu,andChuan-WeiTsao.Usingdataminingtech-niquestopredicthospitalizationofhemodialysispatients.DecisionSup-portSystems,50(2):439–448,2011.(Citedonpages94and98.)

[235]IllhoiYoo,PatriciaAlafaireet,MiroslavMarinov,KeilaPena-Hernandez,RajithaGopidi,Jia-FuChang,andLeiHua.Datamininginhealthcareandbiomedicine:asurveyoftheliterature.Journalofmedicalsystems,36(4):2431–2448,2012.(Citedonpages92,94,and105.)

[236]JinYu,JimHunter,EhudReiter,andSomayajuluSripada.Recognisingvisualpatternstocommunicategasturbinetime-seriesdata.InAppli-cationsandInnovationsinIntelligentSystems,pages105–118.Springer,2003.(Citedonpage140.)

BIBLIOGRAPHY 187

[226] Thi Hong Nhan Vu, Namkyu Park, Yang Koo Lee, Yongmi Lee,Jong Yun Lee, and Keun Ho Ryu. Online discovery of heart rate vari-ability patterns in mobile healthcare services. Journal of Systems andSoftware, 83(10):1930–1940, 2010. (Cited on pages 95 and 99.)

[227] Wei Wang, Honggang Wang, Michael Hempel, Dongming Peng, HamidSharif, and Hsiao-Hwa Chen. Secure stochastic ecg signals based ongaussian mixture model for e-healthcare systems. IEEE Systems Journal,5(4):564–573, 2011. (Cited on pages 95, 98, and 99.)

[228] Andrew R Webb. Statistical pattern recognition. John Wiley & Sons,2003. (Cited on page 40.)

[229] Eric W. Weisstein. Totally ordered set. From MathWorld–A WolframWeb Resource, 2016. (Cited on page 66.)

[230] Ian H Witten and Eibe Frank. Data Mining: Practical machine learningtools and techniques. Morgan Kaufmann, 2005. (Cited on pages 25and 39.)

[231] Stephen Gang Wu, Forrest Sheng Bao, Eric You Xu, Yu-Xuan Wang,Yi-Fan Chang, and Qiao-Liang Xiang. A leaf recognition algorithm forplant classification using probabilistic neural network. In IEEE Inter-national Symposium on Signal Processing and Information Technology,pages 11–16. IEEE, 2007. (Cited on page 74.)

[232] Baile Xie and Hlaing Minn. Real-time sleep apnea detection by clas-sifier combination. IEEE Transactions on Information Technology inBiomedicine, 16(3):469–477, 2012. (Cited on pages 95 and 97.)

[233] Ronald R Yager. A new approach to the summarization of data. Infor-mation Sciences, 28(1):69–86, 1982. (Cited on page 27.)

[234] Jinn-Yi Yeh, Tai-Hsi Wu, and Chuan-Wei Tsao. Using data mining tech-niques to predict hospitalization of hemodialysis patients. Decision Sup-port Systems, 50(2):439–448, 2011. (Cited on pages 94 and 98.)

[235] Illhoi Yoo, Patricia Alafaireet, Miroslav Marinov, Keila Pena-Hernandez,Rajitha Gopidi, Jia-Fu Chang, and Lei Hua. Data mining in healthcareand biomedicine: a survey of the literature. Journal of medical systems,36(4):2431–2448, 2012. (Cited on pages 92, 94, and 105.)

[236] Jin Yu, Jim Hunter, Ehud Reiter, and Somayajulu Sripada. Recognisingvisual patterns to communicate gas turbine time-series data. In Appli-cations and Innovations in Intelligent Systems, pages 105–118. Springer,2003. (Cited on page 140.)

BIBLIOGRAPHY187

[226]ThiHongNhanVu,NamkyuPark,YangKooLee,YongmiLee,JongYunLee,andKeunHoRyu.Onlinediscoveryofheartratevari-abilitypatternsinmobilehealthcareservices.JournalofSystemsandSoftware,83(10):1930–1940,2010.(Citedonpages95and99.)

[227]WeiWang,HonggangWang,MichaelHempel,DongmingPeng,HamidSharif,andHsiao-HwaChen.Securestochasticecgsignalsbasedongaussianmixturemodelfore-healthcaresystems.IEEESystemsJournal,5(4):564–573,2011.(Citedonpages95,98,and99.)

[228]AndrewRWebb.Statisticalpatternrecognition.JohnWiley&Sons,2003.(Citedonpage40.)

[229]EricW.Weisstein.Totallyorderedset.FromMathWorld–AWolframWebResource,2016.(Citedonpage66.)

[230]IanHWittenandEibeFrank.DataMining:Practicalmachinelearningtoolsandtechniques.MorganKaufmann,2005.(Citedonpages25and39.)

[231]StephenGangWu,ForrestShengBao,EricYouXu,Yu-XuanWang,Yi-FanChang,andQiao-LiangXiang.Aleafrecognitionalgorithmforplantclassificationusingprobabilisticneuralnetwork.InIEEEInter-nationalSymposiumonSignalProcessingandInformationTechnology,pages11–16.IEEE,2007.(Citedonpage74.)

[232]BaileXieandHlaingMinn.Real-timesleepapneadetectionbyclas-sifiercombination.IEEETransactionsonInformationTechnologyinBiomedicine,16(3):469–477,2012.(Citedonpages95and97.)

[233]RonaldRYager.Anewapproachtothesummarizationofdata.Infor-mationSciences,28(1):69–86,1982.(Citedonpage27.)

[234]Jinn-YiYeh,Tai-HsiWu,andChuan-WeiTsao.Usingdataminingtech-niquestopredicthospitalizationofhemodialysispatients.DecisionSup-portSystems,50(2):439–448,2011.(Citedonpages94and98.)

[235]IllhoiYoo,PatriciaAlafaireet,MiroslavMarinov,KeilaPena-Hernandez,RajithaGopidi,Jia-FuChang,andLeiHua.Datamininginhealthcareandbiomedicine:asurveyoftheliterature.Journalofmedicalsystems,36(4):2431–2448,2012.(Citedonpages92,94,and105.)

[236]JinYu,JimHunter,EhudReiter,andSomayajuluSripada.Recognisingvisualpatternstocommunicategasturbinetime-seriesdata.InAppli-cationsandInnovationsinIntelligentSystems,pages105–118.Springer,2003.(Citedonpage140.)

188 BIBLIOGRAPHY

[237] Jin Yu, Ehud Reiter, Jim Hunter, and Chris Mellish. Choosing the contentof textual summaries of large time-series data sets. Natural LanguageEngineering, 13(01):25–49, 2007. (Cited on pages 29, 32, 68, 139,and 140.)

[238] Lotfi Zadeh. From computing with numbers to computing with words-from manipulation of measurements to manipulation of perceptions.International Journal of Applied Mathematics and Computer Science,12(3):307–324, 2002. (Cited on page 27.)

[239] Lotfi A Zadeh. Fuzzy logic = computing with words. IEEE transactionson fuzzy systems, 4(2):103–111, 1996. (Cited on pages 27 and 28.)

[240] Lotfi A Zadeh. Toward a theory of fuzzy information granulation and itscentrality in human reasoning and fuzzy logic. Fuzzy sets and systems,90(2):111–127, 1997. (Cited on page 28.)

[241] Lotfi A Zadeh. A new direction in ai: Toward a computational theory ofperceptions. AI magazine, 22(1):73, 2001. (Cited on page 27.)

[242] Frank Zenker and Peter Gärdenfors. Applications of conceptual spaces.(Cited on pages 25 and 42.)

[243] Ying Zhu. Automatic detection of anomalies in blood glucose using amachine learning approach. Journal of Communications and Networks,13(2):125–131, 2011. (Cited on pages 93, 94, 98, 99, 100, and 101.)

188BIBLIOGRAPHY

[237]JinYu,EhudReiter,JimHunter,andChrisMellish.Choosingthecontentoftextualsummariesoflargetime-seriesdatasets.NaturalLanguageEngineering,13(01):25–49,2007.(Citedonpages29,32,68,139,and140.)

[238]LotfiZadeh.Fromcomputingwithnumberstocomputingwithwords-frommanipulationofmeasurementstomanipulationofperceptions.InternationalJournalofAppliedMathematicsandComputerScience,12(3):307–324,2002.(Citedonpage27.)

[239]LotfiAZadeh.Fuzzylogic=computingwithwords.IEEEtransactionsonfuzzysystems,4(2):103–111,1996.(Citedonpages27and28.)

[240]LotfiAZadeh.Towardatheoryoffuzzyinformationgranulationanditscentralityinhumanreasoningandfuzzylogic.Fuzzysetsandsystems,90(2):111–127,1997.(Citedonpage28.)

[241]LotfiAZadeh.Anewdirectioninai:Towardacomputationaltheoryofperceptions.AImagazine,22(1):73,2001.(Citedonpage27.)

[242]FrankZenkerandPeterGärdenfors.Applicationsofconceptualspaces.(Citedonpages25and42.)

[243]YingZhu.Automaticdetectionofanomaliesinbloodglucoseusingamachinelearningapproach.JournalofCommunicationsandNetworks,13(2):125–131,2011.(Citedonpages93,94,98,99,100,and101.)

188 BIBLIOGRAPHY

[237] Jin Yu, Ehud Reiter, Jim Hunter, and Chris Mellish. Choosing the contentof textual summaries of large time-series data sets. Natural LanguageEngineering, 13(01):25–49, 2007. (Cited on pages 29, 32, 68, 139,and 140.)

[238] Lotfi Zadeh. From computing with numbers to computing with words-from manipulation of measurements to manipulation of perceptions.International Journal of Applied Mathematics and Computer Science,12(3):307–324, 2002. (Cited on page 27.)

[239] Lotfi A Zadeh. Fuzzy logic = computing with words. IEEE transactionson fuzzy systems, 4(2):103–111, 1996. (Cited on pages 27 and 28.)

[240] Lotfi A Zadeh. Toward a theory of fuzzy information granulation and itscentrality in human reasoning and fuzzy logic. Fuzzy sets and systems,90(2):111–127, 1997. (Cited on page 28.)

[241] Lotfi A Zadeh. A new direction in ai: Toward a computational theory ofperceptions. AI magazine, 22(1):73, 2001. (Cited on page 27.)

[242] Frank Zenker and Peter Gärdenfors. Applications of conceptual spaces.(Cited on pages 25 and 42.)

[243] Ying Zhu. Automatic detection of anomalies in blood glucose using amachine learning approach. Journal of Communications and Networks,13(2):125–131, 2011. (Cited on pages 93, 94, 98, 99, 100, and 101.)

188BIBLIOGRAPHY

[237]JinYu,EhudReiter,JimHunter,andChrisMellish.Choosingthecontentoftextualsummariesoflargetime-seriesdatasets.NaturalLanguageEngineering,13(01):25–49,2007.(Citedonpages29,32,68,139,and140.)

[238]LotfiZadeh.Fromcomputingwithnumberstocomputingwithwords-frommanipulationofmeasurementstomanipulationofperceptions.InternationalJournalofAppliedMathematicsandComputerScience,12(3):307–324,2002.(Citedonpage27.)

[239]LotfiAZadeh.Fuzzylogic=computingwithwords.IEEEtransactionsonfuzzysystems,4(2):103–111,1996.(Citedonpages27and28.)

[240]LotfiAZadeh.Towardatheoryoffuzzyinformationgranulationanditscentralityinhumanreasoningandfuzzylogic.Fuzzysetsandsystems,90(2):111–127,1997.(Citedonpage28.)

[241]LotfiAZadeh.Anewdirectioninai:Towardacomputationaltheoryofperceptions.AImagazine,22(1):73,2001.(Citedonpage27.)

[242]FrankZenkerandPeterGärdenfors.Applicationsofconceptualspaces.(Citedonpages25and42.)

[243]YingZhu.Automaticdetectionofanomaliesinbloodglucoseusingamachinelearningapproach.JournalofCommunicationsandNetworks,13(2):125–131,2011.(Citedonpages93,94,98,99,100,and101.)

Publications in the series Örebro Studies in Technology

1. Bergsten, Pontus (2001) Observers and Controllers for Takagi – Sugeno Fuzzy Systems. Doctoral Dissertation.

2. Iliev, Boyko (2002) Minimum-time Sliding Mode Control of Robot Manipulators. Licentiate Thesis.

3. Spännar, Jan (2002) Grey box modelling for temperature estimation. Licentiate Thesis.

4. Persson, Martin (2002) A simulation environment for visual servoing. Licentiate Thesis.

5. Boustedt, Katarina (2002) Flip Chip for High Volume and Low Cost – Materials and Production Technology. Licentiate Thesis.

6. Biel, Lena (2002) Modeling of Perceptual Systems – A Sensor Fusion Model with Active Perception. Licentiate Thesis.

7. Otterskog, Magnus (2002) Produktionstest av mobiltelefonantenner i mod-växlande kammare. Licentiate Thesis.

8. Tolt, Gustav (2003) Fuzzy-Similarity-Based Low-level Image Processing. Licentiate Thesis.

9. Loutfi, Amy (2003) Communicating Perceptions: Grounding Symbols to Artificial Olfactory Signals. Licentiate Thesis.

10. Iliev, Boyko (2004) Minimum-time Sliding Mode Control of Robot Manipulators. Doctoral Dissertation.

11. Pettersson, Ola (2004) Model-Free Execution Monitoring in Behavior-Based Mobile Robotics. Doctoral Dissertation.

12. Överstam, Henrik (2004) The Interdependence of Plastic Behaviour and Final Properties of Steel Wire, Analysed by the Finite Element Metod. Doctoral Dissertation.

13. Jennergren, Lars (2004) Flexible Assembly of Ready-to-eat Meals. Licentiate Thesis.

14. Jun, Li (2004) Towards Online Learning of Reactive Behaviors in Mobile Robotics. Licentiate Thesis.

15. Lindquist, Malin (2004) Electronic Tongue for Water Quality Assessment. Licentiate Thesis.

16. Wasik, Zbigniew (2005) A Behavior-Based Control System for Mobile Manipulation. Doctoral Dissertation.




















































17. Berntsson, Tomas (2005) Replacement of Lead Baths with Environment Friendly Alternative Heat Treatment Processes in Steel Wire Production. Licentiate Thesis.

18. Tolt, Gustav (2005) Fuzzy Similarity-based Image Processing. Doctoral Dissertation.

19. Munkevik, Per (2005) ”Artificial sensory evaluation – appearance-based analysis of ready meals”. Licentiate Thesis.

20. Buschka, Pär (2005) An Investigation of Hybrid Maps for Mobile Robots. Doctoral Dissertation.

21. Loutfi, Amy (2006) Odour Recognition using Electronic Noses in Robotic and Intelligent Systems. Doctoral Dissertation.

22. Gillström, Peter (2006) Alternatives to Pickling; Preparation of Carbon and Low Alloyed Steel Wire Rod. Doctoral Dissertation.

23. Li, Jun (2006) Learning Reactive Behaviors with Constructive Neural Networks in Mobile Robotics. Doctoral Dissertation.

24. Otterskog, Magnus (2006) Propagation Environment Modeling Using Scattered Field Chamber. Doctoral Dissertation.

25. Lindquist, Malin (2007) Electronic Tongue for Water Quality Assessment. Doctoral Dissertation.

26. Cielniak, Grzegorz (2007) People Tracking by Mobile Robots using Thermal and Colour Vision. Doctoral Dissertation.

27. Boustedt, Katarina (2007) Flip Chip for High Frequency Applications – Materials Aspects. Doctoral Dissertation.

28. Soron, Mikael (2007) Robot System for Flexible 3D Friction Stir Welding. Doctoral Dissertation.

29. Larsson, Sören (2008) An industrial robot as carrier of a laser profile scanner. – Motion control, data capturing and path planning. Doctoral Dissertation.

30. Persson, Martin (2008) Semantic Mapping Using Virtual Sensors and Fusion of Aerial Images with Sensor Data from a Ground Vehicle. Doctoral Dissertation.

31. Andreasson, Henrik (2008) Local Visual Feature based Localisation and Mapping by Mobile Robots. Doctoral Dissertation.

32. Bouguerra, Abdelbaki (2008) Robust Execution of Robot Task-Plans: A Knowledge-based Approach. Doctoral Dissertation.

















































33. Lundh, Robert (2009) Robots that Help Each Other: Self-Configuration of Distributed Robot Systems. Doctoral Dissertation.

34. Skoglund, Alexander (2009) Programming by Demonstration of Robot Manipulators. Doctoral Dissertation.

35. Ranjbar, Parivash (2009) Sensing the Environment: Development of Monitoring Aids for Persons with Profound Deafness or Deafblindness. Doctoral Dissertation.

36. Magnusson, Martin (2009) The Three-Dimensional Normal- Distributions Transform – an Efficient Representation for Registration, Surface Analysis, and Loop Detection. Doctoral Dissertation.

37. Rahayem, Mohamed (2010) Segmentation and fitting for Geometric Reverse Engineering. Processing data captured by a laser profile scanner mounted on an industrial robot. Doctoral Dissertation.

38. Karlsson, Alexander (2010) Evaluating Credal Set Theory as a Belief Framework in High-Level Information Fusion for Automated Decision-Making. Doctoral Dissertation.

39. LeBlanc, Kevin (2010) Cooperative Anchoring – Sharing Information About Objects in Multi-Robot Systems. Doctoral Dissertation.

40. Johansson, Fredrik (2010) Evaluating the Performance of TEWA Systems. Doctoral Dissertation.

41. Trincavelli, Marco (2010) Gas Discrimination for Mobile Robots. Doctoral Dissertation.

42. Cirillo, Marcello (2010) Planning in Inhabited Environments: Human-Aware Task Planning and Activity Recognition. Doctoral Dissertation.

43. Nilsson, Maria (2010) Capturing Semi-Automated Decision Making: The Methodology of CASADEMA. Doctoral Dissertation.

44. Dahlbom, Anders (2011) Petri nets for Situation Recognition. Doctoral Dissertation.

45. Ahmed, Muhammad Rehan (2011) Compliance Control of Robot Manipulator for Safe Physical Human Robot Interaction. Doctoral Dissertation.

46. Riveiro, Maria (2011) Visual Analytics for Maritime Anomaly Detection. Doctoral Dissertation.











































47. Rashid, Md. Jayedur (2011) Extending a Networked Robot System to Include Humans, Tiny Devices, and Everyday Objects. Doctoral Dissertation.

48. Zain-ul-Abdin (2011) Programming of Coarse-Grained Reconfigurable Architectures. Doctoral Dissertation.

49. Wang, Yan (2011) A Domain-Specific Language for Protocol Stack Implementation in Embedded Systems. Doctoral Dissertation.

50. Brax, Christoffer (2011) Anomaly Detection in the Surveillance Domain. Doctoral Dissertation.

51. Larsson, Johan (2011) Unmanned Operation of Load-Haul-Dump Vehicles in Mining Environments. Doctoral Dissertation.

52. Lidström, Kristoffer (2012) Situation-Aware Vehicles: Supporting the Next Generation of Cooperative Traffic Systems. Doctoral Dissertation.

53. Johansson, Daniel (2012) Convergence in Mixed Reality-Virtuality Environments. Facilitating Natural User Behavior. Doctoral Dissertation.

54. Stoyanov, Todor Dimitrov (2012) Reliable Autonomous Navigation in Semi-Structured Environments using the Three-Dimensional Normal Distributions Transform (3D-NDT). Doctoral Dissertation.

55. Daoutis, Marios (2013) Knowledge Based Perceptual Anchoring: Grounding percepts to concepts in cognitive robots. Doctoral Dissertation.

56. Kristoffersson, Annica (2013) Measuring the Quality of Interaction in Mobile Robotic Telepresence Systems using Presence, Spatial Formations and Sociometry. Doctoral Dissertation.

57. Memedi, Mevludin (2014) Mobile systems for monitoring Parkinson’s disease. Doctoral Dissertation.

58. König, Rikard (2014) Enhancing Genetic Programming for Predictive Modeling. Doctoral Dissertation.

59. Erlandsson, Tina (2014) A Combat Survivability Model for Evaluating Air Mission Routes in Future Decision Support Systems. Doctoral Dissertation.

60. Helldin, Tove (2014) Transparency for Future Semi-Automated Systems. Effects of transparency on operator performance, workload and trust. Doctoral Dissertation.











































61. Krug, Robert (2014) Optimization-based Robot Grasp Synthesis and Motion Control. Doctoral Dissertation.

62. Reggente, Matteo (2014) Statistical Gas Distribution Modelling for Mobile Robot Applications. Doctoral Dissertation.

63. Längkvist, Martin (2014) Modeling Time-Series with Deep Networks. Doctoral Dissertation.

64. Hernández Bennetts, Víctor Manuel (2015) Mobile Robots with In-Situ and Remote Sensors for Real World Gas Distribution Modelling. Doctoral Dissertation.

65. Alirezaie, Marjan (2015) Bridging the Semantic Gap between Sensor Data and Ontological Knowledge. Doctoral Dissertation.

66. Pashami, Sepideh (2015) Change Detection in Metal Oxide Gas Sensor Signals for Open Sampling Systems. Doctoral Dissertation.

67. Lagriffoul, Fabien (2016) Combining Task and Motion Planning. Doctoral Dissertation.

68. Mosberger, Rafael (2016) Vision-based Human Detection from Mobile Machinery in Industrial Environments. Doctoral Dissertation.

69. Mansouri, Masoumeh (2016) A Constraint-Based Approach for Hybrid Reasoning in Robotics. Doctoral Dissertation.

70. Albitar, Houssam (2016) Enabling a Robot for Underwater Surface Cleaning. Doctoral Dissertation.

71. Mojtahedzadeh, Rasoul (2016) Safe Robotic Manipulation to Extract Objects from Piles: From 3D Perception to Object Selection. Doctoral Dissertation.

72. Köckemann, Uwe (2016) Constraint-based Methods for Human-aware Planning. Doctoral Dissertation.

73. Jansson, Anton (2016) Only a Shadow. Industrial Computed Tomography Investigation, and Method Development, Concerning Complex Material Systems. Licentiate Thesis.

74. Sebastian Hällgren (2017) Some aspects on designing for metal Powder Bed Fusion. Licentiate Thesis.

75. Junges, Robert (2017) A Learning-driven Approach for Behavior Modeling in Agent-based Simulation. Doctoral Dissertation.

76. Ricão Canelhas, Daniel (2017) Truncated Signed Distance Fields Applied To Robotics. Doctoral Dissertation.

















































77. Asadi, Sahar (2017) Towards Dense Air Quality Monitoring: Time-Dependent Statistical Gas Distribution Modelling and Sensor Planning. Doctoral Dissertation.

78. Banaee, Hadi (2018) From Numerical Sensor Data to Semantic Representations: A Data-driven Approach for Generating Linguistic Descriptions. Doctoral Dissertation.