File and filesystem fragmentation in Lustre · 2016. 3. 7.
![Page 1: File and filesystem fragmentation in Lustre · 2016. 3. 7. · File and filesystem fragmentation in Lustre Ashley Pittman apittman@ddn.com ... • Contents of individual file is dispersed](https://reader036.vdocuments.us/reader036/viewer/2022071219/60570d3e62887827a32297cb/html5/thumbnails/1.jpg)
ddn.com ©2012 DataDirect Networks. All Rights Reserved.
File and filesystem fragmentation in Lustre
Ashley Pittman
What is fragmentation
▶ File fragmentation
  • Contents of an individual file are dispersed over different locations on the device.
▶ Filesystem fragmentation
  • Available space is dispersed over different locations on the device.
Fragmentation
Why is this bad?
▶ Spinning media is good for streaming I/O
  • But poor for seeks.
▶ With file fragmentation, seek performance becomes the dominant factor in I/O performance.
Assumptions
▶ Fragmentation cost is a function of utilisation level.
  • Appears to be the case
  • Will depend hugely on workload
▶ Cost of utilisation is not just fragmentation, but also the time cost of the block allocator.
Single OST performance
[Chart: cost of fragmentation; performance axis from 0% to 100%]
Why is this bad on Lustre?
▶ Parallel writes use many OSTs for performance.
▶ Performance is the number of OSTs multiplied by the speed of the slowest OST.
  • A single slow OST can have a dramatic effect on the overall bandwidth.
▶ Likelihood of at least one OST being slow is 1 - (1 - p)^N, where p is the probability of an individual OST being slow and N is the number of OSTs.
  • Increasing likelihood as OST counts rise.
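The two relationships on this slide can be tried out numerically. The sketch below is only an illustration: it assumes OSTs become slow independently of one another, and the 5% probability and the MB/s figures are made-up inputs, not measurements from the talk.

```python
def prob_any_slow(p_slow, n_osts):
    """Probability that at least one of n_osts OSTs is slow.

    Assumes each OST is slow independently with probability p_slow.
    """
    return 1.0 - (1.0 - p_slow) ** n_osts


def aggregate_bandwidth(ost_speeds):
    """Slide's model: number of OSTs multiplied by the slowest OST's speed."""
    return len(ost_speeds) * min(ost_speeds)


if __name__ == "__main__":
    # Hypothetical input: each OST has a 5% chance of being slow.
    for n in (1, 10, 60):
        print(n, "OSTs:", round(prob_any_slow(0.05, n), 3))

    # Hypothetical input: 59 OSTs at 500 MB/s and one at 100 MB/s.
    print(aggregate_bandwidth([500.0] * 59 + [100.0]), "MB/s")
```

Under these assumed inputs the chance of hitting at least one slow OST is 5% for a single OST but roughly 95% across 60, and one OST running at a fifth of the speed pulls the modelled aggregate down to a fifth as well.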
Parallel, 60 OST performance
[Chart: cost of fragmentation; performance axis from 0% to 100%]
OST utilisation - good.
OST utilisation - good
OST utilisation - bad
OST utilisation - ugly
Quick solutions
▶ Rebalance files
  • Now possible with Lustre 2.4
  • Only works with adequate space available.
▶ Reduce usage levels
Avoidance tips
▶ Overspecify the filesystem.
  • Buy twice as much space, and use 100% SSDs.
▶ Don’t consume all the space.
▶ Reduce individual OST fragmentation by limiting the number of small files.
▶ Keep OST space utilisation flat.
  • Avoid unstriped files.
▶ Larger block allocation sizes.
▶ Stripe to a subset of OSTs?
  • Potentially avoiding overly-full OSTs, so avoiding the worst effects for more bandwidth.
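As a rough illustration of the last two tips, the sketch below measures how flat utilisation is and picks the OSTs that are not overly full. Everything in it is an assumption for illustration: the utilisation figures are invented, the 80% cut-off is arbitrary, and how the resulting list would be applied to a file layout (for example through `lfs setstripe`) is left out.

```python
def utilisation_spread(utilisation):
    """Gap between the fullest and emptiest OST; 0.0 means perfectly flat."""
    values = list(utilisation.values())
    return max(values) - min(values)


def pick_osts(utilisation, max_used=0.8):
    """Return OST indices whose used fraction is at most max_used, emptiest first.

    max_used is an arbitrary, hypothetical threshold.
    """
    eligible = [(used, idx) for idx, used in utilisation.items() if used <= max_used]
    return [idx for _used, idx in sorted(eligible)]


if __name__ == "__main__":
    # Hypothetical used fractions keyed by OST index.
    usage = {0: 0.55, 1: 0.93, 2: 0.60, 3: 0.58}
    print("spread:", round(utilisation_spread(usage), 2))
    print("candidate OSTs:", pick_osts(usage))
```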
Is read any better?
▶ Potentially aio_read() can avoid the issue.
  • Smaller reads can complete individually, allowing processing as the data arrives.
  • Adds significant complexity to application.
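The idea of letting smaller reads complete individually can be sketched without the POSIX AIO interface itself. The example below is a stand-in, not `aio_read()`: it issues positional reads (`os.pread`) from a thread pool and hands each chunk to the caller as soon as it finishes. The function name, chunk size and worker count are hypothetical choices.

```python
import os
from concurrent.futures import ThreadPoolExecutor, as_completed


def read_in_chunks(path, chunk_size=1 << 20, workers=4):
    """Yield (offset, data) pairs in completion order, not file order.

    A thread pool with os.pread stands in for aio_read() here.
    """
    size = os.path.getsize(path)
    fd = os.open(path, os.O_RDONLY)
    try:
        with ThreadPoolExecutor(max_workers=workers) as pool:
            # Map each pending read back to the offset it was issued for.
            futures = {
                pool.submit(os.pread, fd, chunk_size, offset): offset
                for offset in range(0, size, chunk_size)
            }
            for future in as_completed(futures):
                # A chunk stuck behind a slow OST does not hold up the others.
                yield futures[future], future.result()
    finally:
        os.close(fd)
```

The caller has to cope with chunks arriving out of order, which is the kind of added application complexity the slide warns about.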
Hidden problems – existing files
▶ Historic OST fragmentation will lead to residual problems
▶ Hard to identify files
▶ Impossible to benchmark
  • Elusive but will affect wall-clock times.
Detecting problems
▶ filefrag -v <filename>
  • Shows block ranges used for files.
  • Can be used to discover if specific files are affected
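Checking many files by hand is tedious, so the extent count can be pulled out of the tool's output programmatically. This is a minimal sketch that assumes the summary line has the form printed by the e2fsprogs `filefrag` ("<file>: N extents found"); a Lustre-patched build may format things differently, and the helper names are hypothetical.

```python
import re
import subprocess


def parse_extent_count(filefrag_output):
    """Return the extent count from filefrag's summary line, or None if absent.

    Assumes a summary of the form "<file>: N extent(s) found".
    """
    matches = re.findall(r"(\d+) extents? found", filefrag_output)
    return int(matches[-1]) if matches else None


def extent_count(path):
    """Run filefrag on path and return the extent count it reports."""
    result = subprocess.run(
        ["filefrag", path], capture_output=True, text=True, check=True
    )
    return parse_extent_count(result.stdout)
```

A file reported with a high extent count relative to its size is a candidate for closer inspection with the verbose output.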
Finding at-risk files.
▶ Large files
▶ Probably striped
  • If large and not striped, possibly part of the problem.
▶ Specific creation date range
▶ List of candidate OSTs.
POSIX!
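The size and date criteria can be checked with ordinary POSIX metadata. The sketch below walks a directory tree and reports large regular files last modified inside a given window. Two assumptions are worth flagging: POSIX `stat` does not expose a creation time, so modification time is used as a proxy, and stripe layout or OST membership is not visible this way and would need Lustre tooling such as `lfs getstripe`. The function name and thresholds are hypothetical.

```python
import os
import stat


def candidate_files(root, min_size, start, end):
    """Yield (path, size, mtime) for large regular files modified in a window.

    min_size is in bytes; start and end are seconds since the epoch.
    mtime is a proxy for creation time, which POSIX stat does not provide.
    """
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if not stat.S_ISREG(st.st_mode):
                continue  # ignore symlinks, devices and other non-regular files
            if st.st_size >= min_size and start <= st.st_mtime <= end:
                yield path, st.st_size, st.st_mtime
```

On a large filesystem a full tree walk like this is itself expensive, so restricting `root` to the directories of interest keeps the scan manageable.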
Conclusions
▶ Scaling costs are huge
▶ Best practice can avoid the issue in most cases
▶ Often un-diagnosed
  • Better monitoring and awareness
▶ Easy to diagnose
▶ Potential quick-fix for new files
▶ Slow-fix available for existing files
  • If you can find them.
▶ Block allocation is a major factor
What about ZFS?
▶ Different performance profile
  • Write policy
  • COW
  • Fewer OSTs
▶ Same basic theory applies
Questions?