![Page 1: Satisfying Strong Application Requirements in Data-Intensive Clouds](https://reader036.vdocuments.us/reader036/viewer/2022062501/56816764550346895ddc41bc/html5/thumbnails/1.jpg)
1
Satisfying Strong Application Requirements in Data-Intensive Clouds
Ph.D. Final Exam
Brian Cho
![Page 2: Satisfying Strong Application Requirements in Data-Intensive Clouds](https://reader036.vdocuments.us/reader036/viewer/2022062501/56816764550346895ddc41bc/html5/thumbnails/2.jpg)
2
Motivating scenario: Using the data-intensive cloud
• Researchers contract with defense agency to investigate ongoing suspicious activity
– e.g., botnet attack, worm, etc.
– Other applications: processing click logs, news items, etc.
1. Transfer large logs (TBs-PBs) from possible victim sites
2. Run computations on logs to find vulnerabilities and source of attack
3. Store data
![Page 3: Satisfying Strong Application Requirements in Data-Intensive Clouds](https://reader036.vdocuments.us/reader036/viewer/2022062501/56816764550346895ddc41bc/html5/thumbnails/3.jpg)
3
Can today’s data-intensive cloud meet these demands?
The researchers require:
1. Control over time and $ cost of transfer, to stay within the contracted budget and time
2. Prioritization of this time-sensitive job over other jobs in its cluster
3. Consistent updates and reads at data store
• Current limitation: Systems are built to optimize key metrics at large scales, but not to meet these strong user requirements
![Page 4: Satisfying Strong Application Requirements in Data-Intensive Clouds](https://reader036.vdocuments.us/reader036/viewer/2022062501/56816764550346895ddc41bc/html5/thumbnails/4.jpg)
4
Strong user requirements
• Many real-world requirements are too important to relax
– Time
– $$$
– Priority
– Data consistency
• It is essential to treat these strong requirements as problem constraints
– … not just as side effects of resource limitations in the cloud
![Page 5: Satisfying Strong Application Requirements in Data-Intensive Clouds](https://reader036.vdocuments.us/reader036/viewer/2022062501/56816764550346895ddc41bc/html5/thumbnails/5.jpg)
5
Thesis statement
• It is feasible to satisfy strong application requirements for data-intensive cloud computing environments, in spite of resource limitations, while simultaneously optimizing run-time metrics.
– Strong application requirements: real-time deadlines, dollar budgets, data consistency, etc.
– Resource limitations: finite compute nodes, limited bandwidth, high latency, frequent failures, etc.
– Run-time metrics: throughput, latency, $ cost, etc.
![Page 6: Satisfying Strong Application Requirements in Data-Intensive Clouds](https://reader036.vdocuments.us/reader036/viewer/2022062501/56816764550346895ddc41bc/html5/thumbnails/6.jpg)
6
Contributions: Practical solutions

| | Solution | Strong user requirement | Key optimized metric |
|---|---|---|---|
| Bulk Data Transfer | Pandora-A [ICDCS 2010] | Deadline | Low $ cost |
| Bulk Data Transfer | Pandora-B [ICAC 2011] | $ Budget | Short transfer time |
| Computation | Natjam | Prioritize production jobs | Job completion time |
| Key-value Storage | Vivace [USENIX ATC 2012] | Consistency | Low latency |
![Page 7: Satisfying Strong Application Requirements in Data-Intensive Clouds](https://reader036.vdocuments.us/reader036/viewer/2022062501/56816764550346895ddc41bc/html5/thumbnails/7.jpg)
7
Pandora-A: Bulk Data Transfer via Internet and Shipping Networks
• Minimize $ cost subject to time deadline
• Transfer options
– Internet links with proportional costs but limited bandwidth
– Shipping links with fixed costs and shipping times depending on method (e.g. ground, air)
• Solution
– Transform into time-expanded network
– Solve min-cost flow on network
• Trace-driven experiments
– Pandora-A solutions better than direct Internet or shipping
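The time-expanded-network idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Pandora-A implementation: there is a single source and a single sink, time advances in whole hours, and the shipping cost is approximated as a per-unit cost so that a plain min-cost flow applies. All names (`MinCostFlow`, `min_cost_transfer`) and all numbers are hypothetical.

```python
from collections import defaultdict

INF = float("inf")


class MinCostFlow:
    """Successive-shortest-paths min-cost flow (Bellman-Ford on the residual graph)."""

    def __init__(self):
        self.adj = defaultdict(list)   # node -> indices into self.edges
        self.edges = []                # each edge: [head, residual capacity, unit cost]

    def add_edge(self, u, v, cap, cost):
        # Forward edge at an even index, its reverse at the next odd index.
        self.adj[u].append(len(self.edges))
        self.edges.append([v, cap, cost])
        self.adj[v].append(len(self.edges))
        self.edges.append([u, 0, -cost])

    def solve(self, s, t, demand):
        """Return the minimum cost of sending `demand` units, or None if infeasible."""
        total, sent = 0, 0
        nodes = list(self.adj)
        while sent < demand:
            dist, prev = {s: 0}, {}
            for _ in range(len(nodes)):          # Bellman-Ford passes
                changed = False
                for u in nodes:
                    if u not in dist:
                        continue
                    for ei in self.adj[u]:
                        v, cap, cost = self.edges[ei]
                        if cap > 0 and dist[u] + cost < dist.get(v, INF):
                            dist[v], prev[v], changed = dist[u] + cost, ei, True
                if not changed:
                    break
            if t not in dist:
                return None                      # deadline cannot be met
            push, v = demand - sent, t
            while v != s:                        # bottleneck along the path
                push = min(push, self.edges[prev[v]][1])
                v = self.edges[prev[v] ^ 1][0]
            v = t
            while v != s:                        # augment along the path
                self.edges[prev[v]][1] -= push
                self.edges[prev[v] ^ 1][1] += push
                v = self.edges[prev[v] ^ 1][0]
            total += push * dist[t]
            sent += push
        return total


def min_cost_transfer(data_tb, deadline_h, net_tb_per_h, net_cost_per_tb,
                      ship_hours, ship_cost_per_tb):
    """Build a time-expanded network with nodes (site, hour) and solve it."""
    mcf = MinCostFlow()
    for t in range(deadline_h):
        mcf.add_edge(("src", t), ("src", t + 1), data_tb, 0)   # data waits at source
        mcf.add_edge(("dst", t), ("dst", t + 1), data_tb, 0)   # data waits at sink
        mcf.add_edge(("src", t), ("dst", t + 1), net_tb_per_h, net_cost_per_tb)
        if t + ship_hours <= deadline_h:                        # shipment arrives in time
            mcf.add_edge(("src", t), ("dst", t + ship_hours), data_tb, ship_cost_per_tb)
    return mcf.solve(("src", 0), ("dst", deadline_h), data_tb)
```

With these made-up parameters, a 10-hour deadline forces the whole transfer over the Internet link, while a 24-hour deadline lets the cheaper shipment carry everything.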
![Page 8: Satisfying Strong Application Requirements in Data-Intensive Clouds](https://reader036.vdocuments.us/reader036/viewer/2022062501/56816764550346895ddc41bc/html5/thumbnails/8.jpg)
8
Pandora-B: Bulk Data Transfer via Internet and Shipping Networks
• Minimize transfer time subject to $ budget
– Bounded binary search on Pandora-A solutions
– Bounds created by transforming time-expanded networks
[Figure: dollar cost ($) vs. transfer time T (hrs); labels: B, LB, UB]
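The bounded binary search can be sketched as follows. This is a minimal illustration, assuming a hypothetical oracle `min_cost(T)` that returns the cheapest Pandora-A cost for deadline `T` (or `None` if `T` is infeasible) and is non-increasing in `T`; the bounds `lb` and `ub` are taken as inputs rather than derived from the network.

```python
def min_time_within_budget(min_cost, budget, lb, ub):
    """Smallest integer deadline T in [lb, ub] whose minimum cost fits the budget.

    Returns None if even the loosest deadline `ub` is over budget.
    """
    def affordable(t):
        cost = min_cost(t)
        return cost is not None and cost <= budget

    if not affordable(ub):
        return None
    while lb < ub:
        mid = (lb + ub) // 2
        if affordable(mid):
            ub = mid          # mid works, try a shorter transfer time
        else:
            lb = mid + 1      # mid is too expensive or infeasible
    return lb


def toy_min_cost(t_hours):
    """Made-up step-shaped cost curve standing in for Pandora-A solutions."""
    if t_hours < 10:
        return None           # deadline too short for any plan
    return 1000 if t_hours < 24 else 200
```

For example, `min_time_within_budget(toy_min_cost, 500, 1, 48)` searches the hypothetical curve for the shortest transfer time that costs at most $500.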
![Page 9: Satisfying Strong Application Requirements in Data-Intensive Clouds](https://reader036.vdocuments.us/reader036/viewer/2022062501/56816764550346895ddc41bc/html5/thumbnails/9.jpg)
9
Vivace: Consistent data for congested geo-distributed systems
• Strongly consistent key-value store
– Low latency across geo-distributed data centers
– Under congestion
• New algorithms
– Prioritize a small amount of critical information
– To avoid delay due to congestion
• Evaluated using a practical prioritization infrastructure
![Page 10: Satisfying Strong Application Requirements in Data-Intensive Clouds](https://reader036.vdocuments.us/reader036/viewer/2022062501/56816764550346895ddc41bc/html5/thumbnails/10.jpg)
10
Natjam: Prioritizing production jobs in MapReduce/Hadoop
• Mixed workloads
– Production jobs
  • Time sensitive
  • Directly affect revenue
– Research jobs
  • e.g., long term analysis
• Example: Ad provider
[Diagram: Ad click-through logs; Count clicks; Update ads. Slow counts → Show old ads → Don’t get paid $$$. Is there a better way to place ads? Run machine learning analysis. Lots of historical logs. Need a large cluster.]
• Prioritize production jobs
![Page 11: Satisfying Strong Application Requirements in Data-Intensive Clouds](https://reader036.vdocuments.us/reader036/viewer/2022062501/56816764550346895ddc41bc/html5/thumbnails/11.jpg)
11
Contributions
• Natjam prioritizes production jobs
• While giving research jobs spare capacity
• Suspend/Resume tasks in research jobs
– Production jobs can gain resources immediately
– Research jobs can use many resources at a time, without wasting work
• Develop eviction policies that choose which tasks to suspend
![Page 12: Satisfying Strong Application Requirements in Data-Intensive Clouds](https://reader036.vdocuments.us/reader036/viewer/2022062501/56816764550346895ddc41bc/html5/thumbnails/12.jpg)
12
Natjam Outline
• Motivation
• Contributions
• Background: MapReduce/Hadoop
• State-of-the-art
• Solution: Suspend/Resume
• Design
• Evaluation
![Page 13: Satisfying Strong Application Requirements in Data-Intensive Clouds](https://reader036.vdocuments.us/reader036/viewer/2022062501/56816764550346895ddc41bc/html5/thumbnails/13.jpg)
13
Background: MapReduce/Hadoop
• Distributed computation on large cluster
• Each job consists of Map and Reduce tasks
• Job stages
1. Map tasks run computations in parallel
2. Shuffle combines intermediate Map outputs
3. Reduce tasks run computations in parallel
[Diagram: Map (M) tasks and Reduce (R) tasks of a job]
![Page 14: Satisfying Strong Application Requirements in Data-Intensive Clouds](https://reader036.vdocuments.us/reader036/viewer/2022062501/56816764550346895ddc41bc/html5/thumbnails/14.jpg)
14
Background: MapReduce/Hadoop
• Distributed computation on large cluster
• Each job consists of Map and Reduce tasks
• Job stages
1. Map tasks run computations in parallel
2. Shuffle combines intermediate Map outputs
3. Reduce tasks run computations in parallel
• Map input/Reduce output stored in distributed file system (e.g. HDFS)
• Scheduling: Which task to run on empty resources (slots)
[Diagram: Map (M) and Reduce (R) tasks of Job 1, Job 2, and Job 3]
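The three job stages can be illustrated with a single-process word count. This is only a sketch of the programming model, not Hadoop code: the function names are hypothetical, and a real cluster would run the Map and Reduce tasks in parallel on separate nodes.

```python
from collections import defaultdict


def map_task(line):
    """Map: emit an intermediate (word, 1) pair for every word in one input line."""
    return [(word, 1) for word in line.split()]


def shuffle(map_outputs):
    """Shuffle: group the intermediate values of all Map tasks by key."""
    groups = defaultdict(list)
    for output in map_outputs:
        for key, value in output:
            groups[key].append(value)
    return groups


def reduce_task(key, values):
    """Reduce: combine all values that share a key."""
    return key, sum(values)


def run_job(lines):
    map_outputs = [map_task(line) for line in lines]   # stage 1
    groups = shuffle(map_outputs)                       # stage 2
    return dict(reduce_task(k, vs) for k, vs in groups.items())  # stage 3
```

Calling `run_job(["a b a", "b c"])` runs one Map task per line and one Reduce call per distinct word.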
![Page 15: Satisfying Strong Application Requirements in Data-Intensive Clouds](https://reader036.vdocuments.us/reader036/viewer/2022062501/56816764550346895ddc41bc/html5/thumbnails/15.jpg)
15
State-of-the-art: Separate clusters
• Submit production jobs to a production cluster
• Submit research jobs to a research cluster
![Page 16: Satisfying Strong Application Requirements in Data-Intensive Clouds](https://reader036.vdocuments.us/reader036/viewer/2022062501/56816764550346895ddc41bc/html5/thumbnails/16.jpg)
16
State-of-the-art: Separate clusters
• Submit production jobs to a production cluster
• Submit research jobs to a research cluster
• Trace of job submissions to Yahoo production cluster
• Periods of under-utilization, where research jobs could potentially fill in
[Plot: number of Reduce slots in use vs. time (hours:mins), against the Reduce slot capacity; annotation: under-utilization. Plot used with permission from Yahoo.]
![Page 17: Satisfying Strong Application Requirements in Data-Intensive Clouds](https://reader036.vdocuments.us/reader036/viewer/2022062501/56816764550346895ddc41bc/html5/thumbnails/17.jpg)
17
State-of-the-art: Single cluster Hadoop scheduling
• Ideally,
– Enough capacity for production jobs
– Run research tasks on all idle production slots
• But,
– Killing tasks (e.g. Fair Scheduler) can lead to wasted work
[Plot: number of Reduce slots in use vs. time (hours:mins), against the Reduce slot capacity; annotations: under-utilization, wasted work. Plot used with permission from Yahoo.]
![Page 18: Satisfying Strong Application Requirements in Data-Intensive Clouds](https://reader036.vdocuments.us/reader036/viewer/2022062501/56816764550346895ddc41bc/html5/thumbnails/18.jpg)
18
State-of-the-art: Single cluster Hadoop scheduling
• Ideally,
– Enough capacity for production jobs
– Run research tasks on all idle production slots
• But,
– Killing tasks (e.g. Fair Scheduler) can lead to wasted work
– No preemption (e.g. Capacity Scheduler) can lead to production jobs waiting for resources
[Plot: number of Reduce slots in use vs. time (hours:mins), against the Reduce slot capacity; annotation: production jobs aren’t assigned resources. Plot used with permission from Yahoo.]
![Page 19: Satisfying Strong Application Requirements in Data-Intensive Clouds](https://reader036.vdocuments.us/reader036/viewer/2022062501/56816764550346895ddc41bc/html5/thumbnails/19.jpg)
19
Approach: Suspend/Resume
• Suspend/Resume tasks within and across research jobs
– Production jobs can gain resources immediately
– Research jobs can use many resources at a time, without wasting work
• Focus on Reduce tasks
– Reduce tasks take longer, so more work to lose (median Map 19 seconds vs. Reduce 231 seconds [Facebook])
[Plot: number of Reduce slots in use vs. time (hours:mins), against the Reduce slot capacity. Plot used with permission from Yahoo.]
![Page 20: Satisfying Strong Application Requirements in Data-Intensive Clouds](https://reader036.vdocuments.us/reader036/viewer/2022062501/56816764550346895ddc41bc/html5/thumbnails/20.jpg)
20
Goals: Prioritize production jobs
• Requirement: Production jobs should have the same completion time as if they were executed in an exclusive production cluster
– Possibly with a small overhead
• Optimization: Research jobs should have the shortest completion time possible
• Constraint: Finite cluster resources
![Page 21: Satisfying Strong Application Requirements in Data-Intensive Clouds](https://reader036.vdocuments.us/reader036/viewer/2022062501/56816764550346895ddc41bc/html5/thumbnails/21.jpg)
21
Challenges
• Avoid Suspend overhead
– Would require production jobs to wait for resources
• Avoid Resume overhead
– Would delay research jobs from making progress
• Optimize task evictions
– Job completion time is the metric that users care about
– Develop eviction policies that have the least impact on job completion times
![Page 22: Satisfying Strong Application Requirements in Data-Intensive Clouds](https://reader036.vdocuments.us/reader036/viewer/2022062501/56816764550346895ddc41bc/html5/thumbnails/22.jpg)
22
Natjam Design
• Motivation
• Contributions
• Background: MapReduce/Hadoop
• State-of-the-art
• Solution: Suspend/Resume
• Design
• Evaluation

• Scheduler
– Hadoop → Natjam
• Architecture
– Hadoop → Natjam
• Suspend/Resume tasks
• Eviction Policies
– Task
– Job
![Page 23: Satisfying Strong Application Requirements in Data-Intensive Clouds](https://reader036.vdocuments.us/reader036/viewer/2022062501/56816764550346895ddc41bc/html5/thumbnails/23.jpg)
23
Background: Capacity Scheduler
• Limitation: research jobs cannot scale down
• Hadoop capacity shared using queues
– Guaranteed capacity (G)
– Maximum capacity (M)
![Page 24: Satisfying Strong Application Requirements in Data-Intensive Clouds](https://reader036.vdocuments.us/reader036/viewer/2022062501/56816764550346895ddc41bc/html5/thumbnails/24.jpg)
24
Background: Capacity Scheduler
• Limitation: research jobs cannot scale down
• Hadoop capacity shared using queues
– Guaranteed capacity (G)
– Maximum capacity (M)
• Example
– Production (P) queue: G 80%/M 80%
– Research (R) queue: G 20%/M 40%
1. Production job submitted first (time →): P takes 80% (under-utilization); R grows to 40%
2. Research job submitted first (time →): R takes 40% (under-utilization); P cannot grow beyond 60%
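The arithmetic behind the example can be sketched as below. This is a simplified model, not the Capacity Scheduler's code: it assumes every job could use the whole cluster, that both jobs are running at the same time, and that nothing is ever preempted. The function name and the percentage representation are hypothetical.

```python
def allocate_without_preemption(queues, arrival_order):
    """Share of the cluster (in %) each queue holds once all jobs have arrived.

    queues maps a queue name to (guaranteed %, maximum %). Each queue grabs as
    much free capacity as its maximum allows and never gives any of it back.
    """
    free = 100
    allocation = {}
    for name in arrival_order:
        _guaranteed, maximum = queues[name]
        allocation[name] = min(maximum, free)
        free -= allocation[name]
    return allocation


queues = {"P": (80, 80), "R": (20, 40)}
research_first = allocate_without_preemption(queues, ["R", "P"])
production_first = allocate_without_preemption(queues, ["P", "R"])
```

In the research-first case the research queue keeps its 40%, so production is left with 60% even though its guaranteed capacity is 80%.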
![Page 25: Satisfying Strong Application Requirements in Data-Intensive Clouds](https://reader036.vdocuments.us/reader036/viewer/2022062501/56816764550346895ddc41bc/html5/thumbnails/25.jpg)
25
Natjam Scheduler
• Does not require Maximum capacity
• Scales down research jobs
![Page 26: Satisfying Strong Application Requirements in Data-Intensive Clouds](https://reader036.vdocuments.us/reader036/viewer/2022062501/56816764550346895ddc41bc/html5/thumbnails/26.jpg)
26
Natjam Scheduler
• Does not require Maximum capacity
• Scales down research jobs
1. P/R Guaranteed: 80%/20% (time →): R takes 100%; P takes 80%
2. P/R Guaranteed: 100%/0% (time →): R takes 100%; P takes 100%
Prioritize Production Jobs
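The scale-down step in these two cases amounts to a small calculation, sketched here under assumptions: shares are expressed as percentages of the cluster, production is entitled to reclaim capacity only up to its guaranteed share, and the function name and parameters are hypothetical rather than taken from Natjam.

```python
def reclaim_for_production(research_share, production_share,
                           production_guaranteed, production_demand):
    """Percent of the cluster to suspend from research jobs for production."""
    free = 100 - research_share - production_share
    # Production may claim up to its guarantee, but no more than it asks for.
    target = min(production_demand, production_guaranteed)
    needed = max(target - production_share - free, 0)
    return min(needed, research_share)
```

With research holding the whole cluster, a production guarantee of 80% leads to 80% being reclaimed, and a guarantee of 100% leads to all of it being reclaimed; if enough capacity is already free, nothing is suspended.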
![Page 27: Satisfying Strong Application Requirements in Data-Intensive Clouds](https://reader036.vdocuments.us/reader036/viewer/2022062501/56816764550346895ddc41bc/html5/thumbnails/27.jpg)
27
Background: Hadoop YARN architecture
• Resource Manager
• Application Master per application
• Tasks are launched on containers of memory
– Formerly, slots in Hadoop
[Diagram: Resource Manager (Capacity Scheduler); Node A and Node B with Node Manager A and Node Manager B; Application Master 1 and Application Master 2; Task (App1), Task (App2), and an empty container. Message label: ask container.]
![Page 28: Satisfying Strong Application Requirements in Data-Intensive Clouds](https://reader036.vdocuments.us/reader036/viewer/2022062501/56816764550346895ddc41bc/html5/thumbnails/28.jpg)
28
Suspend/Resume architecture
• Preemptor
– Decides when resources should be reclaimed from queues
– Chooses victim job
• Releaser
– Chooses task to evict
• Local Suspender
– Saves state
– Promptly exits
• Messaging overheads
[Diagram: Resource Manager (Capacity Scheduler, Preemptor); Node A and Node B with Node Manager A and Node Manager B; Application Master 1 and Application Master 2 (Releaser); Task (App1) and Task (App2) (Local Suspender); an empty container. Message labels: preempt(), # containers to release, release(), suspend, saved state, ask container, resume().]
![Page 29: Satisfying Strong Application Requirements in Data-Intensive Clouds](https://reader036.vdocuments.us/reader036/viewer/2022062501/56816764550346895ddc41bc/html5/thumbnails/29.jpg)
29
Suspending and Resuming Tasks
• When suspending, we must save enough state to be used when resuming the task.
• By using existing intermediate data we save small state
– Simple
– Low overhead
![Page 30: Satisfying Strong Application Requirements in Data-Intensive Clouds](https://reader036.vdocuments.us/reader036/viewer/2022062501/56816764550346895ddc41bc/html5/thumbnails/30.jpg)
30
Suspending and Resuming Tasks
• Existing intermediate data used
– Reduce inputs, stored at local host
– Reduce outputs, stored on HDFS
• Suspend state saved
– Key counter
– Reduce input path
– Hostname
– List of suspended task attempt IDs
[Diagram: Task Attempt 1 (Inputs, Key Counter) → suspended: container freed, suspend state saved → resumed Task Attempt 2 (Inputs, Key Counter, skip). HDFS paths: tmp/task_att_1, tmp/task_att_2, outdir/.]
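The key-counter idea can be sketched with a toy Reduce task. This is an illustration under assumptions, not Natjam's code: Reduce inputs are modeled as an in-memory list of (key, values) groups in a fixed order, the reduce function is a sum, and the class, function, host, and attempt-ID names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class SuspendState:
    """The small state listed on the slide, saved when a task is suspended."""
    key_counter: int                 # number of keys already reduced
    reduce_input_path: str
    hostname: str
    suspended_attempt_ids: list = field(default_factory=list)


def run_reduce(groups, skip_keys=0, suspend_after=None):
    """Reduce the groups in order, skipping the first `skip_keys` of them.

    Stops early once `suspend_after` keys in total have been processed.
    Returns (outputs of this attempt, total key counter).
    """
    outputs, key_counter = [], skip_keys
    for key, values in groups[skip_keys:]:
        if suspend_after is not None and key_counter >= suspend_after:
            break
        outputs.append((key, sum(values)))
        key_counter += 1
    return outputs, key_counter


groups = [("a", [1, 1]), ("b", [1]), ("c", [1, 1, 1]), ("d", [1])]

# Attempt 1 is suspended after two keys; only the small state is kept.
out1, counter = run_reduce(groups, suspend_after=2)
state = SuspendState(counter, "tmp/task_att_1", "nodeA", ["attempt_1"])

# Attempt 2 reuses the existing inputs and skips the keys already reduced.
out2, _ = run_reduce(groups, skip_keys=state.key_counter)
```

The outputs of the two attempts together equal what one uninterrupted run would produce, which is why no completed work is wasted.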
![Page 31: Satisfying Strong Application Requirements in Data-Intensive Clouds](https://reader036.vdocuments.us/reader036/viewer/2022062501/56816764550346895ddc41bc/html5/thumbnails/31.jpg)
31
Two-level Eviction Policies
• Job-level eviction
– Chooses victim job
• Task-level eviction
– Chooses task to evict
[Diagram: Resource Manager (Capacity Scheduler, Preemptor); Node A and Node B with Node Manager A and Node Manager B; Application Master 1 and Application Master 2 (Releaser); Task (App2) (Local Suspender). Message labels: preempt(), # containers to release, release().]
![Page 32: Satisfying Strong Application Requirements in Data-Intensive Clouds](https://reader036.vdocuments.us/reader036/viewer/2022062501/56816764550346895ddc41bc/html5/thumbnails/32.jpg)
32
Task eviction policies
• Based on time remaining
– Last task to finish decides job completion time
– Task that finishes earlier releases container earlier
• Application Master keeps track of time remaining
• Shortest Remaining Time (SRT)
– Shortens the tail
– Holds on to containers that would be released soon
• Longest Remaining Time (LRT)
– May lengthen the tail
– Releases containers as soon as possible
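A task-level policy reduces to picking one task from the Application Master's remaining-time estimates. The sketch below assumes each policy evicts the task its name describes (SRT evicts the task with the shortest remaining time, LRT the longest); the function name and the dictionary representation are hypothetical.

```python
def pick_task_to_evict(remaining_seconds, policy):
    """Choose the task to suspend from {task id: estimated seconds remaining}."""
    if policy == "SRT":
        return min(remaining_seconds, key=remaining_seconds.get)
    if policy == "LRT":
        return max(remaining_seconds, key=remaining_seconds.get)
    raise ValueError(f"unknown policy: {policy}")


estimates = {"reduce_1": 30, "reduce_2": 120, "reduce_3": 75}
```

With these made-up estimates, SRT suspends `reduce_1` and lets the longest task keep running, while LRT suspends `reduce_2`.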
![Page 33: Satisfying Strong Application Requirements in Data-Intensive Clouds](https://reader036.vdocuments.us/reader036/viewer/2022062501/56816764550346895ddc41bc/html5/thumbnails/33.jpg)
33
Job eviction policies
• Based on amount of resources (e.g. memory) held by job
• Resource Manager holds resource information
• Least Resources (LR)
– Large jobs benefit
– Starvation even with small production jobs
• Most Resources (MR)
– Small jobs benefit
– Large jobs may be delayed for a long time
• Probabilistically-weighted on Resources (PR)
– Avoids biasing tasks: chance of eviction for task is same across all jobs, assuming random task eviction policy
– Many jobs may be delayed
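The three job-level policies can be sketched as one selection function. This is a hedged illustration: resources are modeled as a single number per job (for example containers held), PR is implemented as a draw weighted by that number, and the function and job names are hypothetical.

```python
import random


def pick_victim_job(resources, policy, rng=random):
    """Choose the job to evict from, given {job id: resources held}."""
    if policy == "LR":                      # Least Resources
        return min(resources, key=resources.get)
    if policy == "MR":                      # Most Resources
        return max(resources, key=resources.get)
    if policy == "PR":                      # weighted by resources held
        jobs = list(resources)
        weights = [resources[j] for j in jobs]
        return rng.choices(jobs, weights=weights)[0]
    raise ValueError(f"unknown policy: {policy}")


held = {"research_L": 75, "research_S": 25}
```

Under PR with these made-up numbers, `research_L` is chosen three times as often as `research_S`, so each individual task faces the same chance of eviction if the task is then picked at random within the job.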
![Page 34: Satisfying Strong Application Requirements in Data-Intensive Clouds](https://reader036.vdocuments.us/reader036/viewer/2022062501/56816764550346895ddc41bc/html5/thumbnails/34.jpg)
34
Evaluation
• Microbenchmarks
• Trace-driven experiments
• Natjam was implemented based on Hadoop 0.23 (YARN)
• 7-node cluster in CCT
![Page 35: Satisfying Strong Application Requirements in Data-Intensive Clouds](https://reader036.vdocuments.us/reader036/viewer/2022062501/56816764550346895ddc41bc/html5/thumbnails/35.jpg)
35
Microbenchmarks: Setup
• Avg completion times on empty cluster
– Research Job: ~200s
– Production Job: ~70s
• Job sizes: XL (100% of cluster), L (75%), M (50%), S (25%)
• Task workloads within a job chosen uniformly from the range (1/2 of largest task, largest task]
![Page 36: Satisfying Strong Application Requirements in Data-Intensive Clouds](https://reader036.vdocuments.us/reader036/viewer/2022062501/56816764550346895ddc41bc/html5/thumbnails/36.jpg)
36
Microbenchmark: Comparing Natjam to other techniques
[Bar chart: average execution time (seconds) of the Research-XL job and the Production-S job under Ideal, Capacity scheduler: Hard cap, Capacity scheduler: Soft cap, Killing, and Natjam. Annotations: 50% more than ideal; 90% more than ideal; 20% more than ideal; 2% more than ideal, 15% less than Killing; 7% more than ideal, 40% less than Soft cap. Timeline (seconds): t=0s Research-XL, t=50s Production-S.]
![Page 37: Satisfying Strong Application Requirements in Data-Intensive Clouds](https://reader036.vdocuments.us/reader036/viewer/2022062501/56816764550346895ddc41bc/html5/thumbnails/37.jpg)
37
Microbenchmark: Suspend overhead
• 1.25s increase due to messaging delays
• Task assignments happen in parallel: 4.7s increase in job completion time is
i. Assign Application Master
ii. Assign Map tasks
iii. Assign Reduce tasks
[Bar chart: average time (seconds); annotation: 1.25 s (50%) increase]
![Page 38: Satisfying Strong Application Requirements in Data-Intensive Clouds](https://reader036.vdocuments.us/reader036/viewer/2022062501/56816764550346895ddc41bc/html5/thumbnails/38.jpg)
38
Microbenchmark: Task eviction policies
[Bar chart: average execution time (seconds) of the Research-XL job under Random, Longest remaining time, and Shortest remaining time. Annotation: 17% less than Random. Timeline (seconds): t=0s Research-XL, t=50s Production-S.]
Theorem 1: When production tasks are the same length, SRT results in shortest job completion time.
![Page 39: Satisfying Strong Application Requirements in Data-Intensive Clouds](https://reader036.vdocuments.us/reader036/viewer/2022062501/56816764550346895ddc41bc/html5/thumbnails/39.jpg)
39
Microbenchmark: Job eviction policies
[Bar chart: average execution time (seconds) of the Research-L job and the Research-S job under Probabilistic, Most Resources, and Least Resources. Annotation: Most Resources + SRT = good fit. Timeline (seconds): t=0s Research-L and Research-S, t=50s Production-S.]
Theorem 2: When tasks within each job are the same length, evicting from the minimum number of jobs results in the shortest average job completion time.
![Page 40: Satisfying Strong Application Requirements in Data-Intensive Clouds](https://reader036.vdocuments.us/reader036/viewer/2022062501/56816764550346895ddc41bc/html5/thumbnails/40.jpg)
40
Trace-driven evaluation
• Yahoo trace: scaled production cluster workload + scaled research cluster
• Job completion times
[Plot: completion time (seconds) vs. submission time (mins:seconds)]
![Page 41: Satisfying Strong Application Requirements in Data-Intensive Clouds](https://reader036.vdocuments.us/reader036/viewer/2022062501/56816764550346895ddc41bc/html5/thumbnails/41.jpg)
41
Trace-driven evaluation: Research jobs only
[Plot: completion time (seconds) vs. submission time (mins:seconds) for Natjam, Soft Cap, and Killing. Annotation: 115 seconds.]
![Page 42: Satisfying Strong Application Requirements in Data-Intensive Clouds](https://reader036.vdocuments.us/reader036/viewer/2022062501/56816764550346895ddc41bc/html5/thumbnails/42.jpg)
42
Trace-driven evaluation: CDF of differences (negative is good)
[Four CDF plots of differences: Production Jobs: Natjam - Soft Cap; Production Jobs: Natjam - Killing; Research Jobs: Natjam - Soft Cap; Research Jobs: Natjam - Killing]
![Page 43: Satisfying Strong Application Requirements in Data-Intensive Clouds](https://reader036.vdocuments.us/reader036/viewer/2022062501/56816764550346895ddc41bc/html5/thumbnails/43.jpg)
43
Related Work
• Single cluster job scheduling has focused on:
– Locality of Map tasks [Quincy, Delay Scheduling]
– Speculative execution [LATE Scheduler]
– Average fairness between queues [Capacity Scheduler, Fair Scheduler]
– Recent work: Elastic queues [Amoeba]
• We solve the requirement of prioritizing production jobs
![Page 44: Satisfying Strong Application Requirements in Data-Intensive Clouds](https://reader036.vdocuments.us/reader036/viewer/2022062501/56816764550346895ddc41bc/html5/thumbnails/44.jpg)
44
Natjam summary
• Natjam prioritizes production jobs
• Suspend/Resume tasks in research jobs
• Eviction policies that choose which tasks to suspend
• Evaluation
– Microbenchmarks
– Trace-driven experiments
![Page 45: Satisfying Strong Application Requirements in Data-Intensive Clouds](https://reader036.vdocuments.us/reader036/viewer/2022062501/56816764550346895ddc41bc/html5/thumbnails/45.jpg)
45
Conclusion
| Solution | Strong user requirement | Key optimized metric |
|---|---|---|
| Pandora-A [ICDCS 2010] | Deadline | Low $ cost |
| Pandora-B [ICAC 2011] | $ Budget | Short transfer time |
| Natjam | Prioritize production jobs | Job completion time |
| Vivace [USENIX ATC 2012] | Consistency | Low latency |

• Thesis: It is feasible to satisfy strong application requirements for data-intensive cloud computing environments, in spite of resource limitations, while simultaneously optimizing run-time metrics.
• Contributions: Solutions that reinforce this statement in diverse data-intensive cloud settings.