Dynamic Resource Allocation in Apache Spark
TRANSCRIPT
![Page 1: Dynamic Resource Allocation in Apache Spark](https://reader034.vdocuments.us/reader034/viewer/2022050614/587fa5041a28ab825e8b6af3/html5/thumbnails/1.jpg)
Dynamic Resource Allocation in Apache Spark
Yuta Imai @imai_factory
![Page 2: Dynamic Resource Allocation in Apache Spark](https://reader034.vdocuments.us/reader034/viewer/2022050614/587fa5041a28ab825e8b6af3/html5/thumbnails/2.jpg)
1. RDD Graph
val text = "Hello Spark, this is my first Spark application."
val textArray = text.split(" ").map(_.replaceAll("", ""))
val result = sc.parallelize(textArray).map(item => (item, 1)).reduceByKey((x, y) => x + y).collect()
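For readers without a cluster at hand, the counting logic of this snippet can be sketched on plain Scala collections. This only illustrates what `map` followed by `reduceByKey` computes, not how Spark executes it. The object name `LocalWordCount` is hypothetical, and stripping commas and periods is an assumption, since the pattern passed to `replaceAll` did not survive in the transcript.

```scala
object LocalWordCount {
  // Counts words the way map(item => (item, 1)).reduceByKey(_ + _) would,
  // but on an in-memory collection instead of an RDD.
  def count(text: String): Map[String, Int] = {
    val words = text
      .split(" ")
      .map(_.replaceAll("[,.]", "")) // assumption: strip commas and periods
      .filter(_.nonEmpty)
    words
      .map(item => (item, 1))
      .groupBy { case (word, _) => word }
      .map { case (word, pairs) => (word, pairs.map(_._2).sum) }
  }

  def main(args: Array[String]): Unit = {
    val text = "Hello Spark, this is my first Spark application."
    count(text).toSeq.sortBy(_._1).foreach { case (w, n) => println(s"$w -> $n") }
  }
}
```

In the Spark version the `groupBy` step corresponds to the shuffle that `reduceByKey` triggers, which is why the later slides show a stage boundary at that point.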
![Page 3: Dynamic Resource Allocation in Apache Spark](https://reader034.vdocuments.us/reader034/viewer/2022050614/587fa5041a28ab825e8b6af3/html5/thumbnails/3.jpg)
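2. DAGScheduler
Array -> ParallelCollectionRDD -> MapPartitionsRDD -> ShuffledRDD -> Array
sc.parallelize()   .map(…)   .reduceByKey(…)   .collect()
ParallelCollectionRDD: Partition 0, Partition 1, Partition 2, Partition 3
MapPartitionsRDD: Partition 0, Partition 1, Partition 2, Partition 3
ShuffledRDD: Partition 0, Partition 1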
![Page 4: Dynamic Resource Allocation in Apache Spark](https://reader034.vdocuments.us/reader034/viewer/2022050614/587fa5041a28ab825e8b6af3/html5/thumbnails/4.jpg)
2. DAGScheduler
ParallelCollectionRDD (4 partitions) -> MapPartitionsRDD (4 partitions): Narrow Dependency
MapPartitionsRDD (4 partitions) -> ShuffledRDD (2 partitions): Shuffle Dependency
sc.parallelize()   .map(…)   .reduceByKey(…)   .collect()
![Page 5: Dynamic Resource Allocation in Apache Spark](https://reader034.vdocuments.us/reader034/viewer/2022050614/587fa5041a28ab825e8b6af3/html5/thumbnails/5.jpg)
2. DAGScheduler
Stage 0: ParallelCollectionRDD -> MapPartitionsRDD (Narrow Dependency)
Stage 1: ShuffledRDD (Shuffle Dependency)
Tasks: Task 0, Task 1, Task 2, Task 3, Task 4, Task 5
sc.parallelize()   .map(…)   .reduceByKey(…)   .collect()
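The way the lineage is cut into stages can be sketched in plain Scala. This is a simplified model under stated assumptions, not the DAGScheduler's actual code: the lineage is treated as a linear chain, a new stage starts at every shuffle dependency, and each stage gets one task per partition of its last RDD. The types `Rdd`, `Stage`, and `Dependency` here are hypothetical stand-ins, not Spark classes.

```scala
object StageSketch {
  sealed trait Dependency
  case object Narrow extends Dependency  // each child partition reads one parent partition
  case object Shuffle extends Dependency // each child partition reads from all parent partitions

  // dep describes how this RDD depends on the previous one (None for the source RDD).
  final case class Rdd(name: String, partitions: Int, dep: Option[Dependency])

  final case class Stage(id: Int, rdds: List[Rdd]) {
    // One task per partition of the last RDD in the stage.
    def numTasks: Int = rdds.last.partitions
  }

  // Walk the lineage in order and start a new stage at every shuffle dependency.
  def splitIntoStages(lineage: List[Rdd]): List[Stage] = {
    val groups = lineage.foldLeft(List.empty[List[Rdd]]) {
      case (Nil, rdd)                              => List(List(rdd))
      case (acc, rdd) if rdd.dep.contains(Shuffle) => List(rdd) :: acc
      case (current :: done, rdd)                  => (rdd :: current) :: done
    }
    groups.reverse.map(_.reverse).zipWithIndex.map { case (rdds, id) => Stage(id, rdds) }
  }

  // The lineage from the slides: 4 partitions, 4 partitions, then 2 after the shuffle.
  val lineage: List[Rdd] = List(
    Rdd("ParallelCollectionRDD", 4, None),
    Rdd("MapPartitionsRDD", 4, Some(Narrow)),
    Rdd("ShuffledRDD", 2, Some(Shuffle))
  )

  def main(args: Array[String]): Unit =
    splitIntoStages(lineage).foreach { s =>
      println(s"Stage ${s.id}: ${s.rdds.map(_.name).mkString(", ")} (${s.numTasks} tasks)")
    }
}
```

With the slide's partition counts, this model yields two stages with four and two tasks, which is consistent with the six tasks (Task 0 to Task 5) shown on the slide.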
![Page 6: Dynamic Resource Allocation in Apache Spark](https://reader034.vdocuments.us/reader034/viewer/2022050614/587fa5041a28ab825e8b6af3/html5/thumbnails/6.jpg)
3. TaskScheduler
Partition 0, Partition 1, Partition 2, Partition 3
Partition 0, Partition 1, Partition 2, Partition 3
Task 0, Task 1, Task 2, Task 3
Executors
![Page 7: Dynamic Resource Allocation in Apache Spark](https://reader034.vdocuments.us/reader034/viewer/2022050614/587fa5041a28ab825e8b6af3/html5/thumbnails/7.jpg)
Shuffle File
Worker Node: Executor (Thread running iterator.map(…).map(...)..., Storage)
Worker Node: Executor (Thread running iterator.map(…).map(...)...)
![Page 8: Dynamic Resource Allocation in Apache Spark](https://reader034.vdocuments.us/reader034/viewer/2022050614/587fa5041a28ab825e8b6af3/html5/thumbnails/8.jpg)
DYNAMIC RESOURCE ALLOCATION
![Page 9: Dynamic Resource Allocation in Apache Spark](https://reader034.vdocuments.us/reader034/viewer/2022050614/587fa5041a28ab825e8b6af3/html5/thumbnails/9.jpg)
Dynamic Resource Allocation
• Adds extra executors to an app which has pending tasks.
  – Offloads the challenge of exact resource planning for an app.
• Removes idle executors from an app.
  – Helps a long-running app to free idle executors.
![Page 10: Dynamic Resource Allocation in Apache Spark](https://reader034.vdocuments.us/reader034/viewer/2022050614/587fa5041a28ab825e8b6af3/html5/thumbnails/10.jpg)
Overview
Tasks
Executors
![Page 11: Dynamic Resource Allocation in Apache Spark](https://reader034.vdocuments.us/reader034/viewer/2022050614/587fa5041a28ab825e8b6af3/html5/thumbnails/11.jpg)
Overview
Tasks
Executors
Insufficient capacity
![Page 12: Dynamic Resource Allocation in Apache Spark](https://reader034.vdocuments.us/reader034/viewer/2022050614/587fa5041a28ab825e8b6af3/html5/thumbnails/12.jpg)
![Page 13: Dynamic Resource Allocation in Apache Spark](https://reader034.vdocuments.us/reader034/viewer/2022050614/587fa5041a28ab825e8b6af3/html5/thumbnails/13.jpg)
![Page 14: Dynamic Resource Allocation in Apache Spark](https://reader034.vdocuments.us/reader034/viewer/2022050614/587fa5041a28ab825e8b6af3/html5/thumbnails/14.jpg)
Overview
Tasks
Executors
Insufficient capacity   Optimal capacity
![Page 15: Dynamic Resource Allocation in Apache Spark](https://reader034.vdocuments.us/reader034/viewer/2022050614/587fa5041a28ab825e8b6af3/html5/thumbnails/15.jpg)
Overview
Tasks
Executors
✔ ✔
Insufficient capacity   Optimal capacity   Idle executors
![Page 16: Dynamic Resource Allocation in Apache Spark](https://reader034.vdocuments.us/reader034/viewer/2022050614/587fa5041a28ab825e8b6af3/html5/thumbnails/16.jpg)
Overview
Tasks
Executors
✔ ✔
Insufficient capacity   Optimal capacity   Idle executors   Optimal capacity
![Page 17: Dynamic Resource Allocation in Apache Spark](https://reader034.vdocuments.us/reader034/viewer/2022050614/587fa5041a28ab825e8b6af3/html5/thumbnails/17.jpg)
Request Policy
• An app starts with a user-specified number of executors.
./bin/spark-submit \
  --class <main-class> --master <master-url> \
  --num-executors <# of executors>
• After spark.dynamicAllocation.schedulerBacklogTimeout (sec), the app starts requesting new executors if it has pending task(s).
• The app then requests new executors every spark.dynamicAllocation.sustainedSchedulerBacklogTimeout (sec), doubling the number requested each time: 1, 2, 4, 8, 16…
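The doubling request pattern can be sketched as a small pure-Scala model. It is an illustration under assumptions rather than Spark's implementation: the object `RequestRampSketch` and its method names are hypothetical, and capping the last batch at the number of executors still needed is an assumption of this sketch.

```scala
import scala.annotation.tailrec

object RequestRampSketch {
  // Number of executors requested in each round while tasks stay pending:
  // 1, 2, 4, 8, ... with the last batch capped at what is still needed (assumption).
  def requestsPerRound(neededExtraExecutors: Int): List[Int] = {
    @tailrec
    def loop(remaining: Int, nextBatch: Int, acc: List[Int]): List[Int] =
      if (remaining <= 0) acc.reverse
      else {
        val batch = math.min(nextBatch, remaining)
        loop(remaining - batch, nextBatch * 2, batch :: acc)
      }
    loop(neededExtraExecutors, 1, Nil)
  }

  // Seconds after the backlog first appears at which round k (0-based) fires:
  // the first round waits for schedulerBacklogTimeout, later rounds for
  // sustainedSchedulerBacklogTimeout each.
  def roundTimeSec(k: Int, backlogTimeoutSec: Int, sustainedTimeoutSec: Int): Int =
    backlogTimeoutSec + k * sustainedTimeoutSec

  def main(args: Array[String]): Unit =
    println(requestsPerRound(31).mkString(", "))
}
```

Starting small and doubling keeps a short burst of pending tasks from acquiring many executors, while a sustained backlog still ramps up quickly.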
![Page 18: Dynamic Resource Allocation in Apache Spark](https://reader034.vdocuments.us/reader034/viewer/2022050614/587fa5041a28ab825e8b6af3/html5/thumbnails/18.jpg)
Remove Policy
• An app removes an executor when it has been idle for more than spark.dynamicAllocation.executorIdleTimeout seconds.
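The removal rule can be expressed as a simple predicate over executor state. The sketch below is a hypothetical model: `ExecutorState` and `toRemove` are illustrative names, not Spark APIs, and timestamps are plain seconds for simplicity.

```scala
object IdleRemovalSketch {
  // lastBusySec: the last time (in seconds) the executor finished running a task.
  final case class ExecutorState(id: String, lastBusySec: Long, runningTasks: Int)

  // Executors with no running tasks that have been idle for longer than the
  // timeout are candidates for removal.
  def toRemove(executors: Seq[ExecutorState], nowSec: Long, idleTimeoutSec: Long): Seq[String] =
    executors.collect {
      case e if e.runningTasks == 0 && nowSec - e.lastBusySec > idleTimeoutSec => e.id
    }

  def main(args: Array[String]): Unit = {
    val executors = Seq(
      ExecutorState("a", lastBusySec = 0L, runningTasks = 0),  // idle for 100 s
      ExecutorState("b", lastBusySec = 50L, runningTasks = 0), // idle for 50 s
      ExecutorState("c", lastBusySec = 0L, runningTasks = 1)   // still busy
    )
    println(toRemove(executors, nowSec = 100L, idleTimeoutSec = 60L))
  }
}
```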
![Page 19: Dynamic Resource Allocation in Apache Spark](https://reader034.vdocuments.us/reader034/viewer/2022050614/587fa5041a28ab825e8b6af3/html5/thumbnails/19.jpg)
External Shuffle Service
Worker Node: Executor (Thread running iterator.map(…).map(...)..., Storage)
Worker Node: Executor (Thread running iterator.map(…).map(...)...)
![Page 20: Dynamic Resource Allocation in Apache Spark](https://reader034.vdocuments.us/reader034/viewer/2022050614/587fa5041a28ab825e8b6af3/html5/thumbnails/20.jpg)
![Page 21: Dynamic Resource Allocation in Apache Spark](https://reader034.vdocuments.us/reader034/viewer/2022050614/587fa5041a28ab825e8b6af3/html5/thumbnails/21.jpg)
External Shuffle Service
Worker Node: Executor (Thread running iterator.map(…).map(...)..., Storage), Shuffle Service
Worker Node: Executor (Thread running iterator.map(…).map(...)...), Shuffle Service
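The shuffle service on each worker node serves shuffle files independently of the executors, so an idle executor can be removed without losing the shuffle output it wrote. A minimal sketch of the settings that are typically enabled together follows; the property names are real Spark configuration keys, while the object `DynamicAllocationConfSketch`, its helper, and the timeout values are illustrative assumptions rather than recommendations.

```scala
object DynamicAllocationConfSketch {
  // Settings commonly set together; the timeout values are illustrative only.
  val settings: Seq[(String, String)] = Seq(
    "spark.dynamicAllocation.enabled" -> "true",
    "spark.shuffle.service.enabled" -> "true",
    "spark.dynamicAllocation.schedulerBacklogTimeout" -> "1s",
    "spark.dynamicAllocation.sustainedSchedulerBacklogTimeout" -> "1s",
    "spark.dynamicAllocation.executorIdleTimeout" -> "60s"
  )

  // Render the settings as spark-submit arguments: --conf key=value pairs.
  def toSubmitArgs(conf: Seq[(String, String)]): Seq[String] =
    conf.flatMap { case (k, v) => Seq("--conf", s"$k=$v") }

  def main(args: Array[String]): Unit =
    println(toSubmitArgs(settings).mkString(" "))
}
```

The shuffle service itself also has to be running on every worker node; how it is started depends on the cluster manager and is not covered by this sketch.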