7 Fun Things to do with MapReduce
Chris Hillman – Teradata Data Scientist
[email protected] @chillax7

Upload: hannah-young

Post on 08-Jan-2018


Page 2: Agenda

Map Tasks
• Face Detection
• Character Recognition
• Speech to Text

Shuffling
• Mass Spectrometer processing

Reducers
• Text Mining
• Actual Mining

Cluster Building

Page 3: Face Detection in Images

Step 1. Get a good Open Source Library
Step 2. Check the Example Code

Page 4: Character Recognition

More complex task than Face Detection

SELECT * FROM RecognizeNumberPlate(
    ON anpr.vehiclelogs
    imagecol('recognizedobject')
);

Page 5: Speech to Text

Fed up with word count examples?

How about counting words in a recorded wav file?
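
Counting words in a recording can be sketched as a map step that turns each recording into (word, 1) pairs and a reduce step that sums them. This is only a minimal sketch. The slides do not name a speech-to-text engine, so `transcribe` is a hypothetical stub that returns canned text, and the function names are invented for illustration.

```python
import re
from collections import defaultdict


def transcribe(wav_path):
    """Hypothetical stand-in for a speech-to-text engine (an assumption).

    A real job would decode the audio here; this stub returns canned
    text so the map and reduce steps can be run on their own.
    """
    return "the quick brown fox jumps over the lazy dog the end"


def map_words(wav_path):
    """Map step: one recording in, a stream of (word, 1) pairs out."""
    text = transcribe(wav_path)
    for word in re.findall(r"[a-z']+", text.lower()):
        yield word, 1


def reduce_counts(pairs):
    """Reduce step: sum the counts emitted for each word."""
    totals = defaultdict(int)
    for word, count in pairs:
        totals[word] += count
    return dict(totals)


def word_count(wav_paths):
    """Run the map step over every recording, then reduce the combined output."""
    pairs = (pair for path in wav_paths for pair in map_words(path))
    return reduce_counts(pairs)
```

Because each recording is transcribed independently, the expensive audio step sits entirely in the map task, which is what lets it spread across a cluster.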

Page 6: Proteomics

Mass Spectrometers
• Create a lot of data…
• In XML format…
• It's nasty to work with
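
One way to tame XML output is to flatten it into key/value pairs in a map step and let the shuffle group the values by key. The sketch below assumes a made-up layout (`run`, `scan`, and `peak` elements with `id`, `mz`, and `intensity` attributes); the slides do not show the real file format, so treat every element and attribute name here as hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical layout, invented for this sketch; real instrument output differs.
SAMPLE = """
<run>
  <scan id="1">
    <peak mz="100.5" intensity="20.0"/>
    <peak mz="200.25" intensity="80.0"/>
  </scan>
  <scan id="2">
    <peak mz="150.0" intensity="5.0"/>
  </scan>
</run>
"""


def map_peaks(xml_text):
    """Map step: flatten nested XML into (scan_id, (mz, intensity)) pairs."""
    root = ET.fromstring(xml_text.strip())
    for scan in root.iter("scan"):
        scan_id = int(scan.get("id"))
        for peak in scan.iter("peak"):
            yield scan_id, (float(peak.get("mz")), float(peak.get("intensity")))


def shuffle(pairs):
    """Shuffle step: group every value under its key, as the framework would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(groups)
```

Calling `shuffle(map_peaks(SAMPLE))` yields one list of peaks per scan, ready for whatever per-scan processing follows.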

Page 7: Text Mining

First phases are map tasks
• Text Extraction
• Parsing
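
Extraction and parsing suit map tasks because each document can be handled without looking at any other. A minimal sketch follows, assuming the documents are HTML (the slides do not say what the source format is) and using only the Python standard library. The function names and the (token, (doc_id, position)) output shape are illustrative choices.

```python
import re
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects the text between tags and ignores the tags themselves."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def extract_text(html):
    """Extraction: strip the markup and keep the visible text."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def parse_tokens(text):
    """Parsing: lower-case the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def map_document(doc_id, html):
    """Map step: emit (token, (doc_id, position)) for one document."""
    for position, token in enumerate(parse_tokens(extract_text(html))):
        yield token, (doc_id, position)
```

Keying the output by token means a later reduce step sees every occurrence of a term together, which is the usual starting point for counts or an index.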

Page 8: Actual Mining

Comparing Seismic surveys taken at different points in time??

Page 9: Cluster Building

Why build your own cluster?
• It's fun
• You learn lots
• It gets you invited to parties

Physical or Virtual?

Physical – more fun, looks impressive, harder to build, maintain, use, cost of power

Virtual – performance? Easier to test, try different versions, configurations

Page 10: Thank you

Chris [email protected]
@chillax7
www.bigdatablog.co.uk
