HPC over Cloud

East-West Neo Medicinal u-Lifecare Research Center
Workshop, January 2014
Presented by: Muhammad Bilal Amin
Cloud Computing Team, Ubiquitous Computing Lab
Kyung Hee University, Global Campus, Korea


Post on 23-Feb-2016









This presentation provides a brief introduction to the ITRC Center's Life Care Cloud Infrastructure.

Agenda

High Performance Computing over Cloud
Motivation for HPC over Cloud (HPCoC)
Related work
HPCoC Architecture
HPCoC Contribution
SPHeRe
Motivation for SPHeRe
Implementation Details
Evaluation
SPHeRe's Contributions & Achievements
Conclusion

Motivation for HPC over Cloud


Related Work and Limitations

HPCoC Architecture (Stack View)

UCLab Cloud Infrastructure

The Life Care (LC) cloud service model provides a Software as a Service layer for live services built at the ITRC Center. The Platform as a Service layer includes HPC libraries and middleware. The Infrastructure as a Service layer provides virtualization for ITRC researchers and developers.

Figure: UCLab Cloud Infrastructure. Physical machines host virtual machines through a hypervisor (VMware ESXi with Windows 7 guests, or Xen with Linux guests running the Java Runtime and Hadoop). Node counts shown: 4 virtual nodes and 16 virtual nodes, 20 virtual nodes in total.

The Life Care cloud is a private cloud deployment with 20 virtual nodes. Of these, 4 nodes are dedicated to live services, and 16 nodes support the ITRC Center's research and development activities.

HPCoC Contributions & Uniqueness

A unified Java-based high-performance platform for Grande applications (data- and computation-intensive).
Cloud-enabled Java-based HPC messaging and distribution middleware, e.g. MPJ-Core. MPI-like messaging with fault tolerance incorporated from Hadoop.
Parallel computation-intensive and data-intensive processing on unshared data in MapReduce through in-map/in-reduce parallelism.
Green HPC: virtualized resources are a big step for HPC toward green, energy-efficient computing.
Release of the solution under an open source license for the academic community.
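The in-map parallelism named in the contributions can be pictured with a minimal Java sketch. It assumes the idea is to run several threads inside a single map call over records that share no state. The class `InMapParallelism`, its `map` method, and the toy `expensive` function are hypothetical stand-ins for illustration; they do not use the Hadoop Mapper API.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for a mapper; this is not the Hadoop Mapper API. */
class InMapParallelism {

    // Toy CPU-bound work per record (a repeated rolling hash), standing in
    // for the compute-intensive step applied to each record.
    static long expensive(String record) {
        long h = 7;
        for (int round = 0; round < 1000; round++) {
            for (int i = 0; i < record.length(); i++) {
                h = h * 31 + record.charAt(i);
            }
        }
        return h;
    }

    // One "map" call handles one input split. Records are independent
    // (unshared data), so several threads can process them inside the
    // same call; results go into a thread-safe map.
    static Map<String, Long> map(List<String> split) {
        Map<String, Long> out = new ConcurrentHashMap<>();
        split.parallelStream().forEach(r -> out.put(r, expensive(r)));
        return out;
    }

    public static void main(String[] args) {
        Map<String, Long> out = map(List.of("heart", "lung", "liver"));
        System.out.println(out.size() + " records processed");
    }
}
```

The parallel stream is only one way to get threads inside the call; an explicit thread pool would serve equally well.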

A Performance Initiative towards Large-scale Bio-medical Ontology Matching by Implementing Thread Level Parallelism (TLP) over Multicore Platforms

Motivation for SPHeRe

Effective ontology matching is a computationally intensive operation (in both processing power and memory), requiring matching algorithms with quadratic complexity to be executed over the candidate ontologies.
Gross et al., On Matching Large Life Science Ontologies in Parallel, Lecture Notes in Computer Science (LNCS), 2010

Delays in matching results make ontology matching ill-suited to semi-real-time, semantic web-based systems.
Stoilos et al., A string metric for ontology alignment, ISWC05, Heidelberg, Germany, 2005

The core techniques for achieving better performance are either the optimization of matching algorithms or the fragmentation of ontologies for matching algorithms. Utilization of parallel and distributed platforms has largely been missing.
P. Shvaiko and J. Euzenat, Ontology matching: State of the art and future challenges, IEEE Transactions on Knowledge and Data Engineering, January 2013

Commodity hardware is capable of parallelism, i.e., multi-core processors over a distributed platform (Cloud).
Amin et al., High Performance Java Sockets (HPJS) for scientific Health Clouds, 13th IEEE HealthCom, Beijing, 2012

Cloud is affordable (utility-based pricing) and available (ubiquitous).
Armbrust et al., A view of Cloud Computing, Communications of the ACM, April 2010
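To make the quadratic cost from the first motivation point concrete, here is a minimal Java sketch of a naive all-pairs label matcher. The class `NaiveMatcher` and the case-insensitive comparison are assumptions chosen for illustration, not the matching algorithms used in SPHeRe. The point is only that every source concept is compared with every target concept, so the work grows as the product of the two ontology sizes.

```java
import java.util.List;

/** Hypothetical all-pairs matcher used only to count comparisons. */
class NaiveMatcher {

    /** Returns {matches, comparisons} for an exhaustive label comparison. */
    static long[] match(List<String> source, List<String> target) {
        long matches = 0;
        long comparisons = 0;
        for (String s : source) {
            for (String t : target) {
                comparisons++;                 // one comparison per (s, t) pair
                if (s.equalsIgnoreCase(t)) {
                    matches++;
                }
            }
        }
        return new long[] {matches, comparisons};
    }

    public static void main(String[] args) {
        List<String> a = List.of("Heart", "Lung", "Liver");
        List<String> b = List.of("heart", "Kidney");
        long[] r = match(a, b);
        // 3 source labels x 2 target labels = 6 comparisons, 1 match
        System.out.println(r[0] + " matches after " + r[1] + " comparisons");
    }
}
```

Doubling both inputs quadruples the comparison count, which is why spreading the pairs over several cores or nodes is attractive.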

Research Opportunity: Ontology matching over parallel and distributed commodity hardware

Implementation Challenges

1. End-to-end Parallelism

1. Resolution: A methodology that exploits parallelism from loading through delivery

2. Memory Strain

Related information that is not required at that moment floods memory

Parsing and Loading for Inference vs. Parsing and Loading for Matching

Java Heap Blow-up (2 GB Heap is not Enough)

Unable to iterate over properties of FMA and NCI

Cloud instances have limited memory per instance

2. Resolution: Load only what is needed (smaller memory footprint during execution)

3. Accuracy Preservation

3. Resolution: Decoupling of matching algorithms from distribution

4. Thread Safety

Ontology data is shared among multiple threads (synchronized access leads to sequential execution)

The available OWL frameworks are not thread-safe

Result guarantee

4. Resolution: A thread-safe ontology model, shared across multithreaded execution

5. Scalability with optimal resource utilization

Exploit the available computational resources evenly for concurrency (effective load balancing)

Implementation of the right parallelism technique (partitioning)

Better reduction rate

5. Resolution: Effective distribution of matching requests over the available computational resources

SPHeRe Architecture

Matcher Distribution

The matching request received by the system is subdivided from the macro (matching request) level to the micro (matching task) level.
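The macro-to-micro subdivision can be sketched in Java as follows. This is a minimal illustration under stated assumptions, not SPHeRe's actual distributor: the class `MatcherDistribution`, the slicing rule (near-equal contiguous slices of the source concepts), and the toy label comparison are all hypothetical. Each slice becomes one matching task, and the tasks run on a fixed thread pool over immutable copies of the inputs so that no synchronization is needed while matching.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical distributor: one matching request becomes several matching tasks. */
class MatcherDistribution {

    /** Splits the source concepts into at most `parts` near-equal contiguous slices. */
    static List<List<String>> partition(List<String> source, int parts) {
        List<List<String>> slices = new ArrayList<>();
        int n = source.size();
        for (int p = 0; p < parts; p++) {
            int from = (int) ((long) n * p / parts);
            int to = (int) ((long) n * (p + 1) / parts);
            if (from < to) {
                slices.add(source.subList(from, to));
            }
        }
        return slices;
    }

    /** One matching task: compare a slice of the source with the whole target. */
    static List<String> matchSlice(List<String> slice, List<String> target) {
        List<String> found = new ArrayList<>();
        for (String s : slice) {
            for (String t : target) {
                if (s.equalsIgnoreCase(t)) {
                    found.add(s + " = " + t);
                }
            }
        }
        return found;
    }

    /** Runs one matching request by fanning tasks out over `workers` threads. */
    static List<String> match(List<String> source, List<String> target, int workers) {
        List<String> src = List.copyOf(source);   // immutable, safe to share
        List<String> tgt = List.copyOf(target);
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Callable<List<String>>> tasks = new ArrayList<>();
            for (List<String> slice : partition(src, workers)) {
                tasks.add(() -> matchSlice(slice, tgt));
            }
            List<String> all = new ArrayList<>();
            // invokeAll returns futures in task order, so output is deterministic
            for (Future<List<String>> f : pool.invokeAll(tasks)) {
                all.addAll(f.get());
            }
            return all;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("matching interrupted", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("matching task failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<String> source = List.of("Heart", "Lung", "Liver", "Kidney");
        List<String> target = List.of("kidney", "heart");
        System.out.println(match(source, target, 2));
    }
}
```

Near-equal slices are the simplest form of load balancing; a real system might weight tasks by matcher cost instead.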


Inter-node Communication

Mappings Aggregation

Responsible for accumulating the matched results, creating a corresponding Bridge Ontology (mapping), and delivering it.
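The accumulation step can be sketched in Java as below. This is a hedged illustration only: the class `MappingsAggregator`, the `fma:` and `nci:` prefixes, and the tab-separated output are hypothetical placeholders, and the real Bridge Ontology file format is not reproduced here. The sketch shows matching tasks reporting concurrently into one thread-safe, duplicate-free collection that is rendered once at delivery time.

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentSkipListSet;

/** Hypothetical aggregator; the output is a placeholder text form, not OWL. */
class MappingsAggregator {

    // Sorted, thread-safe set: tasks may report concurrently, duplicates
    // collapse, and the rendered order does not depend on thread timing.
    private final Set<String> mappings = new ConcurrentSkipListSet<>();

    /** Called by a matching task whenever it finds a correspondence. */
    void report(String sourceConcept, String targetConcept) {
        mappings.add(sourceConcept + "\t" + targetConcept);
    }

    /** Number of distinct mappings accumulated so far. */
    int size() {
        return mappings.size();
    }

    /** Renders the accumulated mappings, one tab-separated pair per line. */
    String toBridgeText() {
        return String.join("\n", mappings);
    }

    public static void main(String[] args) {
        MappingsAggregator agg = new MappingsAggregator();
        List<String[]> found = List.of(
                new String[] {"fma:Lung", "nci:Lung"},
                new String[] {"fma:Heart", "nci:Heart"},
                new String[] {"fma:Lung", "nci:Lung"});   // duplicate report
        found.parallelStream().forEach(m -> agg.report(m[0], m[1]));
        System.out.println(agg.toBridgeText());
    }
}
```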

Large Scale Biomedical Ontology Matching tool over High Performance Computing

SPHeRe Performance Evaluation

For performance evaluation, HPCoC is tested with a large-scale ontology matching problem.

Scenario I: Multicore desktop

Scenario II: 4 VM Cloud

Ontology Loading Time

3x faster loading time

Total Memory Footprint

8x more memory efficient

Scalability (Reduction Score)

Outperforms by 40%

Performance Evaluation

~4x to 8x more performance efficient

Performance Evaluation (FMA x NCI)

Performance Evaluation (FMA x SNOMED)

Performance Evaluation (NCI x SNOMED)

Uniqueness / Contributions

Exploitation of Parallel Commodity Hardware for Matching
Implements data-parallelism-based distribution of subsets of the candidate ontologies over the multicore hardware of a multicore platform, and provides a collection of mappings among the ontologies as a bridge ontology file.

End-to-End Performance Initiative (from loading through delivery)
Creates subsets of ontologies depending on the needs of the matching algorithms and caches them in serialized formats, providing single-step ontology loading for matching algorithms in parallel.

Smaller Memory Footprint
Each subset is lightweight due to matcher-based, redundancy-free creation, giving a smaller memory footprint and contributing to overall system performance.

Better Scalability
Uses computational resources efficiently through its matching task distribution.
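The subset-and-cache idea behind the end-to-end and memory-footprint points can be sketched in Java. This is a minimal sketch under assumptions: the class `SubsetCache`, the `Concept` type, and the choice of standard Java serialization are hypothetical, and SPHeRe's actual subset layout and serialized format are not shown in the slides. The sketch keeps only what a label-based matcher would need (id and label), writes that subset once, and later reloads it in a single step.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.List;

/** Hypothetical cache of matcher-specific ontology subsets. */
class SubsetCache {

    /** Hypothetical parsed concept carrying more than a label matcher needs. */
    static final class Concept {
        final String id;
        final String label;
        final String definition;

        Concept(String id, String label, String definition) {
            this.id = id;
            this.label = label;
            this.definition = definition;
        }
    }

    /** Keeps only id -> label; everything else is dropped from the subset. */
    static HashMap<String, String> labelSubset(List<Concept> ontology) {
        HashMap<String, String> subset = new HashMap<>();
        for (Concept c : ontology) {
            subset.put(c.id, c.label);
        }
        return subset;
    }

    /** Writes the subset to disk using standard Java serialization. */
    static void save(HashMap<String, String> subset, Path file) {
        try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(file))) {
            out.writeObject(subset);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Single-step load: no ontology parsing, only deserialization. */
    @SuppressWarnings("unchecked")
    static HashMap<String, String> load(Path file) {
        try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(file))) {
            return (HashMap<String, String>) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Saves to a temporary file, reloads, and cleans up; used for the demo. */
    static HashMap<String, String> roundTrip(HashMap<String, String> subset) {
        try {
            Path cache = Files.createTempFile("subset", ".ser");
            try {
                save(subset, cache);
                return load(cache);
            } finally {
                Files.deleteIfExists(cache);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        List<Concept> parsed = List.of(
                new Concept("c1", "Heart", "A long textual definition"),
                new Concept("c2", "Lung", "Another long textual definition"));
        HashMap<String, String> reloaded = roundTrip(labelSubset(parsed));
        System.out.println(reloaded.size() + " labels reloaded from cache");
    }
}
```

A different matcher would get its own subset (for example, parent-child relations for a structural matcher), so each cached file stays small.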

Achievements

OAEI 2013 evaluation at ISWC 2013 (A-rated conference)

SPHeRe was presented and evaluated on the large-scale biomedical track.
It was noted as the first ontology matching system that utilizes distributed cloud resources.
Our first release of this year ranked among the top 15 systems of 2013 (globally).

Microsoft Research Asia Award 2013-2014
Research funding awarded by Microsoft Research Asia for SPHeRe over the Microsoft Azure platform.

Microsoft Azure4Research Award 2014-2015
SPHeRe for large-scale biomedical ontology matching over the Microsoft Azure platform.

Publications

Conferences

Wajahat Ali Khan, Muhammad Bilal Amin, Asad Masood Khattak, Maqbool Hussain, and Sungyoung Lee, System for Parallel Heterogeneity Resolution (SPHeRe) results for OAEI 2013, 12th Int. Semantic Web Conference (ISWC), 21-25 October 2013, Sydney, Australia.

Ammar Ahmad Awan, Muhammad Bilal Amin, Shujaat Hussain, Aamir Shafi and Sungyoung Lee, An MPI-IO Compliant Java based Parallel I/O library, 13th IEEE CCGrid, Delft, Netherlands, May 2013.

Ammar Ahmad Awan, Muhammad Shoaib Ayub, Aamir Shafi and Sungyoung Lee, Towards Efficient Support for Parallel I/O in Java HPC, 13th PDCAT, Beijing, 2012.

Muhammad Bilal Amin, Wajahat Ali Khan, Shujaat Hussain and Sungyoung Lee, High Performance Java Sockets (HPJS) for healthcare cloud systems, 13th HealthCom 2012, Beijing, Oct 2012.

Muhammad Bilal Amin, Wajahat Ali Khan, Ammar Ahmad Awan and Sungyoung Lee, Intercloud Message Exchange Middleware, 7th ICUIMC 2012, Kuala Lumpur, Malaysia, Feb 2012.

Journals

Muhammad Bilal Amin, Wajahat Ali Khan and Sungyoung Lee, SPHeRe: A performance initiative towards ontology matching by implementing parallelism over cloud platforms, Journal of Supercomputing (SCI, IF 0.9), 2013

Wajahat Ali Khan, Maqbool Hussain, Muhammad Afzal, Muhammad Bilal Amin, Muhammad Aamir Saleem, and Sungyoung Lee, Personalized-Detailed Clinical Model for Data Interoperability among Clinical Standards, Telemedicine and EHealth (SCI, IF 1.416), 2013

Muhammad Bilal Amin, Wajahat Ali Khan and Sungyoung Lee, Enabling Data Parallelism for Large-scale Bio-medical Ontology Matching over Multicore Platforms, Journal of Applied Intelligence (SCI, IF 1.8) (under review), 2014

Conclusion

HPC over cloud is a very cost-effective solution with all the capabilities that expensive clusters or grids can provide.
To fully exploit it, effort is required to implement platforms and applications for computation- and data-intensive problems.
Applications like SPHeRe can be built to resolve compute- and data-intensive problems over multicore platforms to meet performance needs.
Commodity hardware requires fewer man-hours for maintenance and consumes far less energy, which makes it an excellent candidate for Green Computing.

Thank you