
Upload: osborn-blake

Post on 31-Dec-2015


Page 1: CDT PROJECTS 2013-14 John Keane, Software Systems Group jak@cs.man.ac.uk 1. Data Analytics / Big Data 2. Parallel & Distributed Systems 3. Decision Support

CDT PROJECTS 2013-14

John Keane, Software Systems Group

jak@cs.man.ac.uk

1. Data Analytics / Big Data

2. Parallel & Distributed Systems

3. Decision Support Systems

HAPPY TO DISCUSS


Big Data Analytics (IBM funded)

With Nenadic

CHALLENGE

• Investigate:

– Applications: characteristics and predictability

– Data analytic / machine learning algorithms (relatively simple so far)

– Software: Map-Reduce, Hadoop

– Hardware: various platforms
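As a reminder of the Map-Reduce programming model named above, here is a minimal single-process sketch in plain Python. It is not Hadoop code: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the word-count task are illustrative assumptions, chosen only to show the map, group-by-key and reduce steps.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Emit one (word, 1) pair per word, as a mapper would."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group emitted values by key, mimicking the shuffle/sort step."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle and reduce in sequence over a list of strings."""
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    grouped = shuffle(mapped)
    return dict(reduce_phase(k, v) for k, v in grouped.items())


# Hypothetical input documents, for illustration only.
counts = word_count(["big data analytics", "big data on Hadoop"])
```

On a real cluster the map and reduce calls would run in parallel on different nodes; only the structure of the computation carries over from this sketch.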


Bio-medical data analytics

With Nenadic, Zeng, Stivaros (Consultant, RMCH)

• Adverse drug event detection (EU funded)

– Bayesian/Fuzzy association rules algorithms

CHALLENGE

– Compare/contrast accuracy of prediction

• Clinical Outcome Mining (Christie Hospital)

– Data/text-based clinical records, used to better diagnose and predict

CHALLENGE

– Illness staging; multi-modal data; changes over time

• Decision Support for Radiology (NIHR-funded)

– Decision aid to assist better description of scans

CHALLENGE

– Usability; integration with existing tools; link to literature
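To make the association-rule idea concrete, the sketch below computes the two standard rule measures, support and confidence, for a rule of the form {drug} -> {event}. It is plain frequency counting, not the Bayesian or fuzzy variants named above; the function names and the toy records are assumptions made up for illustration.

```python
def support(records, items):
    """Fraction of records that contain every item in `items`."""
    items = set(items)
    return sum(items <= record for record in records) / len(records)


def confidence(records, antecedent, consequent):
    """Estimate P(consequent | antecedent) from co-occurrence counts."""
    joint = support(records, set(antecedent) | set(consequent))
    base = support(records, antecedent)
    return joint / base if base else 0.0


# Made-up toy records, for illustration only (not clinical data).
records = [
    {"drug_A", "rash"},
    {"drug_A", "rash", "drug_B"},
    {"drug_A"},
    {"drug_B"},
]

rule_support = support(records, {"drug_A", "rash"})
rule_confidence = confidence(records, {"drug_A"}, {"rash"})
```

Comparing prediction accuracy across algorithms would then mean comparing how well rules ranked by such measures match known adverse events.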


Itemset Mining Algorithms {baby nappies}->{beer}

• Colossal itemsets:

- Very high dimensional datasets

- Run-time increases exponentially as average row length increases

• Minimal unique itemsets (MUI); SUDA: Special Uniques Detection Algorithm

- “Risky” records, those likely to be linked, e.g. 16 years old + widow

- Records of most concern have many, small MUIs

- SUDA software used by ONS, UK; licensed by Singaporean government

- Algorithm used by UN/World Bank International Household Survey

CHALLENGES:

• Data structure to represent itemsets during search process

• Search space pruning

• Algorithm: bottom-up; top-down; hybrid

• Parallelism


Eco-service composition (EU funded)

with Mehandjiev, MBS

• Aims to determine conditions for achieving eco-friendly, resilient and optimal service compositions on a distributed cloud infrastructure

• Two service optimisation approaches deployed:

1. Global: analyses end-to-end interaction between services

2. Local: computes a local optimisation by creating dynamic service chains between service provider and consumer

CHALLENGE

• Energy-efficient load balancing and scheduling


HPC + Finance (EU funded, UK Government)

• High Frequency Trading

– Flash crashes: dramatic sudden drop in share price; describe/predict

– Working paper: High Frequency Trading and Mini Flash Crashes http://arxiv.org/abs/1211.6667

• HPCFinance

• New models of risk analysis (diverse data integration)

• Role of HPC in Finance and comparison of technologies

• Trade-off between accuracy, speed and cost; comparison of Cloud, GPGPUs and FPGA (Maxeler box)

CHALLENGES:

Data engineering;

Analytics;

Algorithms;

High performance;
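As a starting point for the "describe" half of the flash-crash task, a sudden drop can be flagged by scanning tick data with a sliding time window. The sketch below is only one possible operational definition: the function name, the 1.5-second window and the 0.8% threshold are illustrative assumptions, not values taken from these slides.

```python
def find_sudden_drops(ticks, window_seconds=1.5, min_drop=0.008):
    """Return (peak_time, end_time, fractional_drop) for each sudden drop.

    `ticks` is a list of (time_in_seconds, price) pairs sorted by time.
    """
    drops = []
    start = 0
    for end in range(len(ticks)):
        end_time, end_price = ticks[end]
        # Slide the window start forward so it spans at most window_seconds.
        while end_time - ticks[start][0] > window_seconds:
            start += 1
        # Compare against the highest price seen inside the window.
        peak_time, peak_price = max(ticks[start:end + 1], key=lambda t: t[1])
        drop = (peak_price - end_price) / peak_price
        if drop >= min_drop:
            drops.append((peak_time, end_time, drop))
    return drops


# Made-up price series, for illustration only.
ticks = [(0.0, 100.0), (0.5, 99.9), (1.0, 99.5), (1.2, 98.9), (5.0, 99.8)]
events = find_sudden_drops(ticks)
```

Rescanning the window for its peak on every tick is quadratic in the worst case; at real tick volumes this is where the data engineering and high-performance challenges come in.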


Preference Elicitation from Pairwise Comparison

with Mikhailov, MBS; Siraj, COMSATS IIT, Pakistan

• Decision making is complex in the presence of uncertainty and insufficient knowledge.

• Aim to estimate preferences using pairwise comparison (PC): PC is used when the decision maker is unable to assign scores to the available options; the judgements provided may be inconsistent.

• Work has proposed consistency measures and prioritization measures for the case where revision of judgements is not allowed.

• PriEsT tool now has sensitivity analysis -> best solution.

• CHALLENGES

– Evolutionary approach to multi-criteria DSS

– Work on preference elicitation model and tool

– Group decision making

– Bridge PriEsT and R (popular data mining tool) via XMCDA
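To illustrate how pairwise comparisons become preference weights, the sketch below applies the geometric mean method, one standard prioritization technique, and a strict consistency check based on a[i][j] * a[j][k] = a[i][k]. It is not taken from PriEsT; the function names and the judgement matrices are assumptions for illustration.

```python
import math


def geometric_mean_weights(matrix):
    """Derive normalised priority weights from a pairwise comparison matrix."""
    n = len(matrix)
    row_means = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(row_means)
    return [m / total for m in row_means]


def is_consistent(matrix, tol=1e-9):
    """True if a[i][j] * a[j][k] == a[i][k] holds for every triple (i, j, k)."""
    n = len(matrix)
    return all(
        abs(matrix[i][j] * matrix[j][k] - matrix[i][k]) <= tol
        for i in range(n) for j in range(n) for k in range(n)
    )


# Hypothetical judgements: A is 2x preferred to B and 4x to C; B is 2x C.
consistent = [
    [1.0, 2.0, 4.0],
    [0.5, 1.0, 2.0],
    [0.25, 0.5, 1.0],
]
# Same matrix, but A is judged only 3x preferred to C, breaking transitivity.
inconsistent = [
    [1.0, 2.0, 3.0],
    [0.5, 1.0, 2.0],
    [1.0 / 3.0, 0.5, 1.0],
]

weights = geometric_mean_weights(consistent)
```

For the consistent matrix the weights come out in the ratio 4:2:1. Real judgements are rarely this tidy, which is why graded consistency measures are more useful in practice than the pass/fail check shown here.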