CDT PROJECTS 2013-14
John Keane, Software Systems Group
jak@cs.man.ac.uk
1. Data Analytics / Big Data
2. Parallel & Distributed Systems
3. Decision Support Systems
HAPPY TO DISCUSS
Big Data Analytics (IBM funded)
With Nenadic
CHALLENGE
• Investigate:
– Applications: characteristics and predictability
– Data Analytics / Machine Learning Algorithms: relatively simple so far
– Software: Map-Reduce, Hadoop
– Hardware: various platforms
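As an illustration of the Map-Reduce model named above, here is a minimal single-process sketch in plain Python. It is not Hadoop's API: the function names (`map_phase`, `shuffle`, `reduce_phase`, `map_reduce`) and the word-count task are assumptions chosen only to show the map, group-by-key and reduce steps.

```python
from collections import defaultdict
from itertools import chain


def map_phase(record):
    """Map: emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in record.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine the values collected for one key."""
    return key, sum(values)


def map_reduce(records):
    """Run map, shuffle and reduce in sequence on a list of records."""
    mapped = chain.from_iterable(map_phase(r) for r in records)
    grouped = shuffle(mapped)
    return dict(reduce_phase(k, v) for k, v in grouped.items())


# Hypothetical input records, for illustration only.
records = ["big data analytics", "big data platforms", "data mining"]
counts = map_reduce(records)
```

On a real cluster the map and reduce calls would run on different machines and the shuffle would move data over the network, which is where the application, software and hardware characteristics listed above start to matter.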
Bio-medical data analytics
With Nenadic, Zeng, Stivaros (Consultant, RMCH)
• Adverse drug event detection (EU funded)
– Bayesian/Fuzzy association rule algorithms
CHALLENGE
– Compare/contrast accuracy of prediction
• Clinical Outcome Mining (Christie Hospital)
– Data/text-based clinical records: better diagnosis and prediction
CHALLENGE
– Illness staging; multi-modal data; changes over time
• Decision Support for Radiology (NIHR-funded)
– Decision aid to assist better description of scans
CHALLENGE
– Usability; integration with existing tools; link to literature
Itemset Mining Algorithms {baby nappies}->{beer}
• Colossal itemsets:
- Very high dimensional datasets
- Run-time increases exponentially as average row length increases
• Minimal unique itemsets (MUI); SUDA (Special Uniques Detection Algorithm)
- “risky” records, those likely to be linked, e.g. 16 years old + widow
- Records of most concern have many, small MUIs
- SUDA s/w used by ONS, UK; licensed by Singaporean govt
- Algorithm used by UN/World Bank International Household Survey
CHALLENGES:
• Data structure to represent itemsets during search process
• Search space pruning
• Algorithm: bottom-up; top-down; hybrid
• Parallelism
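To make the idea of a minimal unique itemset concrete, the following is a brute-force, bottom-up sketch in Python. It is not the SUDA implementation; the function name `minimal_unique_itemsets`, the `max_size` cap and the toy records are assumptions for illustration. An itemset is treated as unique when exactly one record contains it, and minimal when none of its proper subsets is unique.

```python
from itertools import combinations


def minimal_unique_itemsets(records, max_size=3):
    """Exhaustive bottom-up search for minimal unique itemsets (MUIs).

    records: list of dicts mapping attribute name -> value.
    Returns {record index: [frozenset of (attribute, value) pairs, ...]}.
    """
    rows = [frozenset(r.items()) for r in records]

    # Count how many records contain each itemset of size 1..max_size.
    support = {}
    for row in rows:
        for k in range(1, max_size + 1):
            for combo in combinations(row, k):
                key = frozenset(combo)
                support[key] = support.get(key, 0) + 1

    result = {}
    for i, row in enumerate(rows):
        found = []
        for k in range(1, max_size + 1):
            for combo in combinations(row, k):
                itemset = frozenset(combo)
                if support[itemset] != 1:
                    continue  # shared with another record: not unique
                # Within one record, uniqueness is inherited by supersets,
                # so checking the (k-1)-subsets is enough for minimality.
                is_minimal = k == 1 or all(
                    support[itemset - {item}] > 1 for item in itemset
                )
                if is_minimal:
                    found.append(itemset)
        result[i] = found
    return result


# Hypothetical microdata, for illustration only.
records = [
    {"age": 16, "marital": "widow", "sex": "F"},
    {"age": 16, "marital": "single", "sex": "F"},
    {"age": 70, "marital": "widow", "sex": "F"},
    {"age": 70, "marital": "married", "sex": "M"},
    {"age": 16, "marital": "single", "sex": "M"},
]
muis = minimal_unique_itemsets(records)
```

Here the first record is singled out by the pair age 16 + widow, although neither value is unique on its own. The enumeration visits every subset of every row up to `max_size`, so its cost grows exponentially with row length, which is why pruning, data structures and parallelism appear as challenges.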
Eco-service composition (EU funded)
with Mehandjiev, MBS
• Aims to determine conditions for achieving eco-friendly, resilient and optimal service compositions on a distributed cloud infrastructure
• Two service optimisation approaches deployed:
1. Global: analyses end-to-end interaction between services
2. Local: computes local optimisation by creating dynamic service chains between service provider and consumer
CHALLENGE
• Energy-efficient load balancing and scheduling
HPC + Finance (EU funded, UK Government)
• High Frequency Trading
– Flash crashes: dramatic sudden drop in share price; describe/predict
– Working paper: High Frequency Trading and Mini Flash Crashes http://arxiv.org/abs/1211.6667
• HPCFinance
– New models of risk analysis (diverse data integration)
– Role of HPC in Finance and comparison of technologies
– Trade-off: accuracy, speed, cost; comparison: Cloud, GPGPUs, FPGA (Maxeler box)
CHALLENGES:
• Data engineering
• Analytics
• Algorithms
• High performance
Preference Elicitation from Pairwise Comparison
with Mikhailov, MBS; Siraj, COMSATS IIT, Pakistan
• Decision making is complex in the presence of uncertainty and insufficient knowledge.
• Aim: estimate preferences using pairwise comparison (PC). PC is used when scores cannot be assigned directly to the available options; the judgements provided may be inconsistent.
• Work has proposed consistency measures and prioritization measures where revision is not allowed.
• PriEsT tool now has sensitivity analysis -> best solution.
• CHALLENGES
– Evolutionary approach to multi-criteria DSS
– Work on preference elicitation model and tool
– Group decision making
– Bridge PriEsT and R (popular data mining tool) via XMCDA
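As a rough illustration of deriving preferences from pairwise comparisons and checking their consistency, the sketch below uses two standard textbook techniques: row geometric mean prioritisation and a Saaty-style consistency index. These are stand-ins, not the measures proposed in this work or implemented in PriEsT; the function names and the judgement matrices are hypothetical.

```python
import math


def priorities(matrix):
    """Row geometric mean prioritisation, normalised to sum to 1.

    matrix[i][j] states how strongly option i is preferred to option j.
    """
    n = len(matrix)
    gm = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(gm)
    return [g / total for g in gm]


def consistency_index(matrix, weights):
    """Saaty-style index (lambda_max - n) / (n - 1).

    lambda_max is estimated from the given weights; the index is 0
    for perfectly consistent judgements and grows with inconsistency.
    """
    n = len(matrix)
    lam = sum(
        sum(matrix[i][j] * weights[j] for j in range(n)) / weights[i]
        for i in range(n)
    ) / n
    return (lam - n) / (n - 1)


# Hypothetical judgements over three options A, B, C.
# Consistent: A is 2x B, B is 2x C, and A is 4x C.
consistent = [[1, 2, 4], [1 / 2, 1, 2], [1 / 4, 1 / 2, 1]]
# Cyclic: A beats B, B beats C, but C beats A.
cyclic = [[1, 2, 1 / 4], [1 / 2, 1, 2], [4, 1 / 2, 1]]

w_ok = priorities(consistent)
ci_ok = consistency_index(consistent, w_ok)
w_bad = priorities(cyclic)
ci_bad = consistency_index(cyclic, w_bad)
```

For the consistent matrix the index is zero up to rounding, while the cyclic judgements give a clearly positive value. This is the situation a prioritization method has to cope with when the decision maker cannot revise their judgements.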