

Page 1: TruEGrid Project Seminar - TU Dresden · 2013. 10. 28. · TruEGrid Project Seminar My Research Topic Ana Cristina Alves de Oliveira Dantas cristina@copin.ufcg.edu.br 1University

TruEGrid Project Seminar
My Research Topic

Ana Cristina Alves de Oliveira Dantas
cristina@copin.ufcg.edu.br

1 University of Technology of Dresden (TU Dresden)

2 Federal University of Campina Grande (UFCG)

3 Federal Institute of Education, Science and Technology of Paraíba (IFPB), Campina Grande Campus

October 25th, 2013

Ana Cristina Oliveira (TU Dresden) TruEGrid Project Seminar October 25th, 2013 1 / 17


Outline

1 My Work in TU Dresden

2 Cost Model

3 Scheduling Performance Evaluation
    Methodology
    Obtained Results

4 My PhD: Previous and Future Work


My Work in TU Dresden

Work at TU Dresden: From June to December

1 Design of a cost model for Data-as-a-Service
2 Design of a scheduling system
3 Development of a scheduling simulator
4 Performance evaluation
    Preliminary results
5 Deployment and demo of the scheduling system into a production environment (current)
6 Improvements to the scheduling system
7 Technical report writing


Cost Model

Datacenter and Scheduling Representation

Figure: Scheduling architecture


Cost Model

VM Access to the Replicated Data

Each query VM accesses a single partition, but the data is replicated

Figure: Query VM accessing a data replica from the storage


Cost Model

Definition of Variables

Figure: Query VM accessing a data replica from the storage


Performance Evaluation Methodology

Scheduling Strategies

1 Cost-based

Based on partial information
k and R are not known, so they are not used to decide on the VM placement

2 Cost-based+

Based on complete and perfect information
k and R are known and are used to decide on the VM placement

3 Random

We randomly select a replica i from the replica set {A, B, C}
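The three strategies can be sketched as follows. This is only an illustration, not the simulator's code: the slides do not give the price formula, so the two lookup tables below are made-up stand-ins for a price estimated without k and R (Cost-based) and with them (Cost-based+). The function names are hypothetical as well.

```python
import random
from typing import Callable, Sequence

# Hypothetical price estimator: maps a candidate replica to an estimated price.
PriceFn = Callable[[str], float]


def choose_random(replicas: Sequence[str], rng: random.Random) -> str:
    """Random strategy: pick any replica uniformly at random."""
    return rng.choice(replicas)


def choose_cost_based(replicas: Sequence[str], price: PriceFn) -> str:
    """Cost-based strategies: pick the replica with the lowest estimated price."""
    return min(replicas, key=price)


REPLICAS = ["A", "B", "C"]

# Made-up estimates (assumption): the first ignores k and R, the second uses them.
PARTIAL_ESTIMATE = {"A": 1.2, "B": 0.9, "C": 1.5}
FULL_ESTIMATE = {"A": 1.0, "B": 1.4, "C": 1.6}

if __name__ == "__main__":
    rng = random.Random(42)
    print("Random:     ", choose_random(REPLICAS, rng))
    print("Cost-based: ", choose_cost_based(REPLICAS, PARTIAL_ESTIMATE.__getitem__))
    print("Cost-based+:", choose_cost_based(REPLICAS, FULL_ESTIMATE.__getitem__))
```

In this sketch, Cost-based and Cost-based+ share the same selection rule and differ only in how much information the price estimate uses, which is why they can disagree on the chosen replica.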


Performance Evaluation Methodology

Assumptions

One query per VM

Each query selects data from 1 file

No file is shared among queries

Each datacenter host may have at most 8 VMs running (according to VM resource requirements)

All VMs have the same computational resources

All queries have the same input and output sizes


Performance Evaluation Methodology

Treatments and System Model

Table: Evaluation Treatments

Nr  Parameter (factor)  Levels
1   #DCs                40
2   #Hosts per DC       {1, 5}

Table: System Model

Nr  Parameter                  Configuration
1   v (CPU cost)               Random double in the range [0.065, 3.41)
2   t (bandwidth cost)         Random double in the range [0.015, 0.51)
3   q                          Random double in the range [0.05, 1)
4   k                          Random double in the range [0.7, 3.5)
5   #queries = #VMs = #files   40
6   #file replicas             3 (randomly placed among datacenters)
7   Max(#VMs per host)         8
8   Output size (bytes)        100
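A minimal sketch of how one random instance of this system model could be generated. The slides do not say at which level each parameter is drawn, so this sketch assumes that v and t are drawn per datacenter, that q and k are drawn per query, and that the 3 replicas of a file sit on distinct datacenters. All names are hypothetical.

```python
import random

NUM_DATACENTERS = 40
NUM_QUERIES = 40          # one query per VM, one file per query
REPLICAS_PER_FILE = 3
OUTPUT_SIZE_BYTES = 100


def draw(rng, low, high):
    """Random double in [low, high), up to floating-point rounding."""
    return low + (high - low) * rng.random()


def build_system_model(seed=0):
    rng = random.Random(seed)
    # Assumption: CPU cost v and bandwidth cost t are per-datacenter values.
    datacenters = [
        {"v": draw(rng, 0.065, 3.41), "t": draw(rng, 0.015, 0.51)}
        for _ in range(NUM_DATACENTERS)
    ]
    # Assumption: q and k are per-query values.
    queries = [
        {
            "q": draw(rng, 0.05, 1.0),
            "k": draw(rng, 0.7, 3.5),
            # Indices of the datacenters holding this query's file replicas.
            "replicas": rng.sample(range(NUM_DATACENTERS), REPLICAS_PER_FILE),
            "output_bytes": OUTPUT_SIZE_BYTES,
        }
        for _ in range(NUM_QUERIES)
    ]
    return datacenters, queries


if __name__ == "__main__":
    dcs, qs = build_system_model(seed=1)
    print(len(dcs), "datacenters,", len(qs), "queries")
    print("first query replicas:", qs[0]["replicas"])
```

Seeding the generator makes each replication of a treatment reproducible.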


Performance Evaluation Obtained Results

Understanding the Simulation

Let $D$ be the set of datacenters: $D = \{d_1, d_2, \ldots, d_{|D|}\}$, where $|D| = 40$

Let $Q$ be the set of queries: $Q = \{q_1, q_2, \ldots, q_{|Q|}\}$, where $|Q| = 40$

Let $P(q_z, d_i, d_j)$ be the price of executing query $q_z$ at datacenter $d_i$ by retrieving data from $d_j$, where $1 \le i, j \le |D|$ and $1 \le z \le |Q|$

Each experiment treatment was replicated 35 times

The results show the mean values for the normalized price with a confidence interval of 95%
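The slides do not state how the 95% confidence interval was computed. As a sketch under that caveat, the mean and interval over the replications of one treatment could be obtained with a normal approximation, using only the Python standard library (with 35 replications a Student's t interval would be slightly wider). The function name and the sample values are illustrative.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(samples, confidence=0.95):
    """Return (mean, lower, upper) using a normal approximation."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two replications")
    centre = mean(samples)
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * stdev(samples) / sqrt(n)
    return centre, centre - half_width, centre + half_width


if __name__ == "__main__":
    # Made-up normalized prices from a few replications (not real results).
    replications = [1.0, 1.2, 1.1, 1.3, 1.4]
    m, low, high = mean_confidence_interval(replications)
    print(f"mean={m:.3f}, 95% CI=[{low:.3f}, {high:.3f}]")
```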


Performance Evaluation Obtained Results

Understanding the Normalized Price Metric

Normalized Price: $NP(q_z, e, r)$

The normalized price of query $q_z$ in the $r$-th replication of experiment $e$:

\[
NP(q_z, e, r) = \frac{P(q_z, d_i, d_j)}{\min P_{q_z}}; \quad d_i \text{ and } d_j \text{ were chosen by the scheduler} \tag{1}
\]

\[
\min P_{q_z} = \min_{d_i, d_j \in D} P(q_z, d_i, d_j)
\]

Note: The datacenters with available resources to schedule a VM will vary according to the number of VMs that are actually running on them. The prices are based on the initial set of DCs.

Optimal Normalized Price

The scheduling objective (to achieve the minimal price):

\[
NP(q_z, e, r) \to 1; \quad \forall z, e, r \tag{2}
\]
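The normalized price metric can be illustrated with a small sketch. The price matrix below is made up for three datacenters and one query, and the function name is hypothetical. Following the note above, the minimum is taken over all datacenter pairs of the initial set, regardless of availability.

```python
def normalized_price(prices, i, j):
    """Normalized price for one fixed query q_z.

    prices[i][j] stands for P(q_z, d_i, d_j): run at d_i, read data from d_j.
    (i, j) is the pair chosen by the scheduler.
    """
    min_price = min(min(row) for row in prices)
    if min_price <= 0:
        raise ValueError("prices must be positive")
    return prices[i][j] / min_price


# Made-up prices for 3 datacenters (assumption, not simulation output).
PRICES = [
    [4.0, 2.0, 6.0],
    [3.0, 5.0, 8.0],
    [7.0, 9.0, 2.5],
]

if __name__ == "__main__":
    print(normalized_price(PRICES, 0, 1))  # optimal choice: 1.0
    print(normalized_price(PRICES, 1, 0))  # 50% above the minimum: 1.5
```

A value of 1 means the scheduler found the cheapest placement; larger values measure how far the chosen placement is from that optimum.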


Performance Evaluation Obtained Results

Treatment 1: 40 DCs and 1 Host/DC (up to 8 VMs each)


Performance Evaluation Obtained Results

Treatment 2: 40 DCs and 5 Hosts/DC (up to 40 VMs each)


My PhD and Research Interest


I worked on the design and implementation of SLA monitoring software funded by the Brazilian National Network for Research and Education (RNP - Rede Nacional de Ensino e Pesquisa) under the Just-in-Time (JiT) Clouds Project

JiT Clouds is an open source middleware to federate resources into private, public or hybrid clouds
http://jitclouds.lsd.ufcg.edu.br

Research interests:
1 Monitoring and analysis of network traffic
2 Network anomaly detection
3 Accounting and pricing of cloud computing services


My PhD and Research Interest

Network Traffic Monitoring Architecture


My PhD and Research Interest

Integration of the Engine with a Cloud Platform

Figure: Example suited to the JiT Clouds platform


Summary

Outlook

Currently working with VM prices based on network and CPU costs

We intend to extend the VM cost model so that VM placement also considers energy efficiency (aligned with the LEADS Project)

As part of my PhD, we also intend to model the costs of network traffic in the presence of SLA breaches, which will be taken into account by the billing service and the decision-making system
