Overview of Hadoop Distributions. Tuning Hadoop Bottlenecks
Aliaksey Diomin (Senior R&D Engineer at Altoros)
© ALTOROS Systems | CONFIDENTIAL
Diomin Aliaksey, R&D
2013, Minsk
Distribution    Open source   Monitoring   Target group
Apache Hadoop   Yes           X            Developers
Cloudera        Yes           Good         All
Hortonworks     Yes           Good         All
MapR            No            Bad          Enterprise
Pivotal HD      No            Bad          Enterprise
How to find the bottleneck?
1. Increase size of cluster
2. Increase input block size
3. Increase buffer size
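
One way to see why a larger input block size helps: with file-based input, the number of map tasks is roughly the number of input splits, and by default a split corresponds to one block. The sketch below is plain arithmetic, not Hadoop code; the 100 GB input size and the helper name `estimated_map_tasks` are assumptions made for illustration.

```python
import math

MB = 1024 * 1024

def estimated_map_tasks(input_bytes: int, block_bytes: int) -> int:
    """Rough estimate: one map task per split, one split per block."""
    return math.ceil(input_bytes / block_bytes)

# Hypothetical 100 GB input (not a figure from the talk).
input_size = 100 * 1024 * MB

for block_mb in (64, 128, 256):
    tasks = estimated_map_tasks(input_size, block_mb * MB)
    print(f"block size {block_mb} MB -> ~{tasks} map tasks")
```

Fewer, larger map tasks mean less per-task startup and scheduling overhead, at the cost of coarser parallelism.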
1. Compression
2. Combiner
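
The slide does not say which data is compressed; assuming it means intermediate map output, the idea is to trade some CPU for less disk and network I/O during the shuffle. A rough sketch, with Python's `zlib` standing in for a Hadoop compression codec and a made-up set of records:

```python
import zlib

# Hypothetical intermediate map output: many repeated <word, 1> records.
words = ["a", "b", "a"] * 10000
records = "".join(f"{w}\t1\n" for w in words).encode("utf-8")

compressed = zlib.compress(records)

print("raw bytes:       ", len(records))
print("compressed bytes:", len(compressed))

# Compression is lossless: the reducer side sees the same records.
restored = zlib.decompress(compressed)
```

The actual ratio depends heavily on the data and the codec, so it is worth measuring on a real job before enabling compression everywhere.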
Wordcount
The reduce function is reused as the combiner
combine 1: <a, 1> <b, 1> <a, 1> => <a, 2> <b, 1>
combine 2: <a, 1> <b, 1> => <a, 1> <b, 1>
Reduce: <a, {1, 2}> <b, {1, 1}> => <a, 3> <b, 2>
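
The word count flow above can be simulated in a few lines. This is a local sketch in plain Python, not Hadoop API code; `map_words` and `sum_values` are names chosen for illustration. It shows that, because addition is associative and commutative, running the reduce function as a combiner on each split does not change the final result.

```python
from collections import defaultdict

def map_words(words):
    """Map step: emit <word, 1> for every word."""
    return [(w, 1) for w in words]

def sum_values(pairs):
    """Reduce step for word count; also usable as the combiner."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return sorted(totals.items())

split_1 = ["a", "b", "a"]
split_2 = ["a", "b"]

# Combiner runs locally on each mapper's output.
combined_1 = sum_values(map_words(split_1))
combined_2 = sum_values(map_words(split_2))

# Reducer sees the pre-aggregated pairs from both mappers.
with_combiner = sum_values(combined_1 + combined_2)

# Same job without a combiner: reducer sees every raw <word, 1> pair.
without_combiner = sum_values(map_words(split_1) + map_words(split_2))

print(with_combiner)
print(without_combiner)
```

With the combiner the reducer receives four pairs instead of five; on real data with many repeated keys the reduction in shuffled records is much larger.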
Mean
combine 1: <k,40> <k,30> <k,20> => <k, 30>
combine 2: <k,2> <k,8> => <k, 5>
Reduce: <k, {30, 5}> => <k, 17.5>
Correct value: (40 + 30 + 20 + 2 + 8)/5 = 20, not 17.5
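
The mismatch is easy to reproduce. A minimal sketch in plain Python (not Hadoop code), using the same numbers as the slide: averaging per-split averages weights each split equally, regardless of how many values it holds.

```python
def mean(values):
    return sum(values) / len(values)

split_1 = [40, 30, 20]
split_2 = [2, 8]

# Naive approach: the mean function is reused as the combiner.
partial_means = [mean(split_1), mean(split_2)]
mean_of_means = mean(partial_means)

# Ground truth: mean over all values at once.
true_mean = mean(split_1 + split_2)

print("mean of means:", mean_of_means)
print("true mean:    ", true_mean)
```

The two results agree only when every split contributes the same number of values, so a mean function cannot be reused as a combiner in general.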
Mean
combine 1: <k,<40,1>> <k,<30,1>> <k,<20,1>> => <k,<90,3>>
combine 2: <k,<2,1>> <k,<8,1>> => <k,<10,2>>
Reduce: <k, {<90,3>, <10,2>}> => <k, 20>
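
The fix is to carry (sum, count) pairs through the combiner and divide only in the reducer. A minimal sketch in plain Python, again not Hadoop API code; `to_pair`, `combine`, and `reduce_mean` are illustrative names.

```python
def to_pair(value):
    """Map step: wrap each value as a (sum, count) pair."""
    return (value, 1)

def combine(pairs):
    """Combiner: add sums and counts; the output has the same shape as the input."""
    total = sum(s for s, _ in pairs)
    count = sum(c for _, c in pairs)
    return (total, count)

def reduce_mean(pairs):
    """Reducer: merge the partial pairs, then divide once."""
    total, count = combine(pairs)
    return total / count

split_1 = [40, 30, 20]
split_2 = [2, 8]

partial_1 = combine([to_pair(v) for v in split_1])
partial_2 = combine([to_pair(v) for v in split_2])

result = reduce_mean([partial_1, partial_2])
print(partial_1, partial_2, result)
```

Because the combiner's output type matches its input type, it can run zero, one, or several times per mapper without changing the result, which matters since Hadoop does not guarantee how many times a combiner is invoked.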