making big data work

Post on 15-Apr-2017

Category:

Technology

Making Big Data work

Lewis Crawford

Principal Architect @ the DataShed

thedatashed.co.uk

Lewis@thedatashed.co.uk

© the DataShed Limited 2015

intro

Who am I?

• For the last 3 years, the DataShed has been providing consultancy services to a vast array of large clients. Our primary focus is ensuring that technology and analytical strategies are truly aligned so that businesses can leverage the latest and greatest in technology to model, mine and describe their data asset.

• We were working with Big Data technology before the term was coined. We have experience delivering analytical systems driven by petabyte datasets, and have designed, implemented and supported one of the largest real-time data integration and predictive analytics platforms in the aviation world.

• Our model is based on using a small number of exceptionally highly skilled individuals to deliver disruptive and innovative solutions in an agile and delivery-focused manner.

So what is ‘Big Data’?

Why do Big Data projects fail?

Too many people think that Big Data is:

“The belief that the more data you have, the more insights and answers will rise automatically from the pool of ones and zeros.”

Gill Press, Forbes.com

How to make Big Data work?

1. Understand your problem

2. Apply appropriate tools

3. Automate everything

Real-time data

Continuous Integration Demo

Little Big Data

A problem closer to home…

• Every business needs to understand:
  • Their potential customers and market
  • Current customers
  • Their products and sales
  • How and when they engage prospects and customers

• Analytics and data are expensive
• Many of the mandatory elements are very similar for everyone
• The DataShed is Analytics as a Service and Single Customer View as a Service.

The deduplication problem…

• An SME has 250,000 customers (two systems of record)
• A brute-force approach to identifying duplicates: 31,249,875,000 comparisons
• Building a system to process a minimum of 100 clients a day…
• 3.1 trillion record pairs to compare, using >10 different algorithms

• A traditional scale-up approach would be expensive, and makes large assumptions around blocking and partitioning rules
• A small data problem but a big data solution?
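The figures on this slide can be checked with a few lines of Python. This is only a sketch of the arithmetic: the brute-force count is the number of unordered pairs, n(n-1)/2, and `blocked_pairs` illustrates how a blocking rule shrinks that number by comparing only records that share a key. The function names and the idea of a single blocking key are assumptions made for illustration, not the DataShed's implementation.

```python
from collections import Counter
from math import comb


def brute_force_pairs(n_records: int) -> int:
    """Unordered pairs when every record is compared with every other record."""
    return comb(n_records, 2)


def blocked_pairs(block_keys) -> int:
    """Pairs compared when only records sharing a (hypothetical) blocking key are compared."""
    block_sizes = Counter(block_keys)
    return sum(comb(size, 2) for size in block_sizes.values())


per_client = brute_force_pairs(250_000)   # one SME with 250,000 customers
per_day = per_client * 100                # a minimum of 100 clients a day

print(f"{per_client:,} comparisons per client")
print(f"{per_day:,} comparisons per day")
```

The daily total comes to roughly 3.1 trillion pairs, and each pair is then scored by more than ten algorithms. Blocking cuts the pair count sharply, but any true duplicate whose records fall into different blocks is never compared, which is the "large assumptions" risk the slide mentions.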

Title  First Name  Surname  Address 1       Address 2    Address 3
Dr     RJ          Smith    Two Oaks        112 Old St.  County Durham
Mrs    Robyn       Smith    112 Old Street  Durham       DH1 5YJ
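To make the two example records concrete, here is a minimal sketch of a single field-by-field comparison. It uses `difflib.SequenceMatcher` from the Python standard library as a stand-in for one of the ">10 different algorithms"; the talk does not say which algorithms are used, and the `ABBREVIATIONS` table, the `normalise` helper and the field names are hypothetical.

```python
from difflib import SequenceMatcher

# Hypothetical abbreviation table, for illustration only.
ABBREVIATIONS = {"st": "street", "rd": "road"}


def normalise(value: str) -> str:
    """Lower-case, replace punctuation with spaces, expand a few known abbreviations."""
    cleaned = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in value.lower())
    return " ".join(ABBREVIATIONS.get(token, token) for token in cleaned.split())


def similarity(a: str, b: str) -> float:
    """Similarity ratio between 0 and 1 for two normalised strings."""
    return SequenceMatcher(None, normalise(a), normalise(b)).ratio()


record_a = {"first_name": "RJ", "surname": "Smith",
            "address": "Two Oaks, 112 Old St., County Durham"}
record_b = {"first_name": "Robyn", "surname": "Smith",
            "address": "112 Old Street, Durham, DH1 5YJ"}

# Score each field separately; a real matcher would combine several such scores.
scores = {field: similarity(record_a[field], record_b[field]) for field in record_a}
for field, score in scores.items():
    print(f"{field}: {score:.2f}")
```

The surnames match exactly and "112 Old St." normalises to the same string as "112 Old Street", but the address parts sit in different columns and the first names only share an initial. A single string measure is therefore unlikely to settle whether these are the same person, which is presumably why several algorithms are combined.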

The Shed demo

How to make Big Data work?

1. Understand your problem
• ‘Big Data’ challenges aren’t necessarily new, although much of the technology is
• Articulate and communicate: focus on distilling your problem down
• Incremental improvement, not wholesale replacement

2. Apply appropriate tools
• Understand the economics as well as the technology
• New technologies need to be evaluated within the context of your problem scope
• New technologies are enablers, not deliverables (#datalake)
• ‘Big Data’ technology should be seen as complementary to existing technology

3. Automate everything
• Continuous integration to include all testing
• Containerise where possible
• Measure everything

If you really want to get involved…

Get your hands dirty

If you’re interested in learning more, we’ll be hosting a hands-on labs event in the near future.

Send your details to:
Email: hello@thedatashed.co.uk
Twitter: @thedatashed

Any questions?

Lewis Crawford

Principal Architect @ the DataShed

thedatashed.co.uk

Lewis@thedatashed.co.uk
