Transcript

How to turbocharge your data transfers with WebHDFS

Andy Done, Data Platform Lead

Last time…

Since then…

Hadoop: 100, 40
Storage: 1, 0.5
Events: 15, 10
ExaSol: 10, 4
Load times: 2.5, 6

Load times

Problem WebHDFS

Old way

hadoop fs -cat /some/path/* | bulk_load my_table

WebHDFS way

IMPORT INTO TABLE my_table FROM
  FILE 'http://namenode/webhdfs/v1/some/path/file_1'
  FILE 'http://namenode/webhdfs/v1/some/path/file_2'
  FILE 'http://namenode/webhdfs/v1/some/path/file_n'
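
The statement on the slide is abbreviated. As a rough sketch of how such a statement might be generated, the Python below builds one FILE clause per HDFS file. It assumes that a WebHDFS read is requested with the `op=OPEN` query parameter, and that the database accepts an `IMPORT ... FROM CSV AT '<base url>' FILE '<file>'` form that joins the base URL and file name with a slash. The helper names (`webhdfs_file_clause`, `build_import`) and the port `50070` are illustrative assumptions, not taken from the talk, so check the exact grammar against the ExaSol documentation before relying on it.

```python
from urllib.parse import quote, urlencode


def webhdfs_file_clause(hdfs_path, user=None):
    """Return the path-plus-query part of a WebHDFS OPEN request (hypothetical helper)."""
    params = {"op": "OPEN"}  # OPEN is the WebHDFS operation for reading a file
    if user:
        params["user.name"] = user  # optional user for unsecured clusters
    return quote(hdfs_path.lstrip("/")) + "?" + urlencode(params)


def build_import(table, namenode, hdfs_paths, user=None):
    """Assemble an IMPORT statement with one FILE clause per HDFS file.

    The statement shape is an assumption; adjust it to your database's grammar.
    """
    base = f"http://{namenode}/webhdfs/v1"
    files = "\n".join(
        f"  FILE '{webhdfs_file_clause(path, user)}'" for path in hdfs_paths
    )
    return f"IMPORT INTO {table} FROM CSV AT '{base}'\n{files};"


statement = build_import(
    "my_table",
    "namenode:50070",  # assumed namenode HTTP address
    ["/some/path/file_1", "/some/path/file_2"],
)
print(statement)
```

Listing each file separately, rather than piping one concatenated stream, is what gives the database the chance to pull the files over several connections at once.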

WebHDFS benefits

• Simple
• Efficient
• Ubiquitous
• Parallelisable
• Bidirectional
• Fast
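
To illustrate the "parallelisable" point: because every file has its own HTTP URL, any HTTP client can fetch several files concurrently. The following is a minimal sketch using only the Python standard library. The function names, the worker count, and the namenode address in the usage comment are assumptions made for illustration. It relies on `urlopen` following the namenode's redirect to the datanode that holds the data.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen


def open_url(namenode, hdfs_path):
    """Build the WebHDFS read URL for one HDFS file."""
    return f"http://{namenode}/webhdfs/v1/{hdfs_path.lstrip('/')}?op=OPEN"


def fetch(url):
    """Download one file; urlopen follows the redirect issued by the namenode."""
    with urlopen(url) as response:
        return response.read()


def fetch_all(namenode, hdfs_paths, fetch_one=fetch, workers=4):
    """Fetch several files concurrently and return their contents in input order."""
    urls = [open_url(namenode, path) for path in hdfs_paths]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fetch_one, urls))


# Usage (needs a reachable cluster; the address is a placeholder):
# contents = fetch_all("namenode:50070", ["/some/path/file_1", "/some/path/file_2"])
```

The `fetch_one` parameter exists so the download step can be swapped out, for example with a stub in tests or with a client that streams to disk instead of holding each file in memory.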

Conclusion

Thank you

We're hiring!
