making big data work
Post on 15-Apr-2017
Making Big Data work
Lewis Crawford
Principal Architect @ the DataShed
thedatashed.co.uk
Lewis@thedatashed.co.uk
© the DataShed Limited 2015
intro
Who am I?
• For the last 3 years, the DataShed has been providing consultancy services to a vast array of large clients. Our primary focus is ensuring that technology and analytical strategies are truly aligned so that businesses can leverage the latest and greatest in technology to model, mine and describe their data asset.
• We were working with Big Data technology before the term was coined. We have experience delivering analytical systems driven by petabyte datasets, and have designed, implemented and supported one of the largest real-time data integration and predictive analytics platforms in the aviation world.
• Our model is based on using a small number of exceptionally highly skilled individuals to deliver disruptive and innovative solutions in an agile and delivery-focused manner.
So what is ‘Big Data’?
Why do Big Data projects fail?
Too many people think that Big Data is:
“The belief that the more data you have, the more insights and answers will rise automatically from the pool of ones and zeros.”
Gill Press, Forbes.com
How to make Big Data work?
1. Understand your problem
2. Apply appropriate tools
3. Automate everything.
Real-time data
Continuous Integration Demo
How to make Big Data work?
1. Understand your problem
2. Apply appropriate tools
3. Automate everything.
Little Big Data
A problem closer to home…
• Every business needs to understand:
  • Their potential customers and market
  • Current customers
  • Their products and sales
  • How and when they engage prospects and customers
• Analytics and data are expensive
• Many of the mandatory elements are very similar for everyone
• The DataShed is Analytics as a Service and Single Customer View as a Service.
The deduplication problem…
• An SME has 250,000 customers (two systems of record)
• To identify duplicates by brute force: 31,249,875,000 pairwise comparisons
• Building a system to process a minimum of 100 clients a day…
• 3.1 trillion record pairs to compare using >10 different algorithms
• A traditional scale-up approach would be expensive, and makes large assumptions around blocking and partitioning rules
• A small data problem but a big data solution?
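The comparison counts on this slide follow from the number of unordered pairs, n(n-1)/2. The sketch below reproduces that arithmetic and, as an illustration only, shows how a blocking rule shrinks the workload. The 1,000 equal blocks are a hypothetical split, not a figure from the talk, and the function names are invented for this example.

```python
from math import comb


def pairwise_comparisons(n_records: int) -> int:
    """Unordered record pairs a brute-force scan must compare: n * (n - 1) / 2."""
    return comb(n_records, 2)


def blocked_comparisons(block_sizes: list[int]) -> int:
    """Pairs compared when records are only matched within their own block."""
    return sum(comb(size, 2) for size in block_sizes)


# Figures from the slide: one SME with 250,000 customers, 100 clients a day.
per_client = pairwise_comparisons(250_000)
per_day = per_client * 100

# Hypothetical blocking rule: 1,000 equal blocks of 250 records each.
with_blocking = blocked_comparisons([250] * 1_000)

print(f"brute force, one client: {per_client:,}")
print(f"brute force, 100 clients: {per_day:,}")
print(f"one client, hypothetical blocking: {with_blocking:,}")
```

Blocking cuts the work by orders of magnitude, but only finds duplicates that land in the same block, which is the "large assumption" the slide warns about.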
Title  First Name  Surname  Address 1       Address 2    Address 3
Dr     RJ          Smith    Two Oaks        112 Old St.  County Durham
Mrs    Robyn       Smith    112 Old Street  Durham       DH1 5YJ
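The two sample rows show why exact, column-by-column matching misses likely duplicates: the street appears in different address columns and is abbreviated differently. A minimal sketch of fuzzy field comparison follows, using Python's standard `difflib.SequenceMatcher` as a stand-in. The talk does not name the algorithms it uses; the abbreviation table, function names and record layout here are assumptions for illustration, and the field values are taken from the slide with spacing restored.

```python
import re
from difflib import SequenceMatcher

# Hypothetical, deliberately tiny abbreviation lookup (an assumption).
ABBREVIATIONS = {"st": "street", "rd": "road"}


def normalise(text: str) -> str:
    """Lower-case, drop punctuation, and expand a few known abbreviations."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return " ".join(ABBREVIATIONS.get(word, word) for word in words)


def similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] between two normalised strings."""
    return SequenceMatcher(None, normalise(a), normalise(b)).ratio()


def best_match(value: str, candidates: list[str]) -> float:
    """Best similarity of one value against any candidate field."""
    return max(similarity(value, candidate) for candidate in candidates)


record_a = {"first_name": "RJ", "surname": "Smith",
            "address": ["Two Oaks", "112 Old St.", "County Durham"]}
record_b = {"first_name": "Robyn", "surname": "Smith",
            "address": ["112 Old Street", "Durham", "DH1 5YJ"]}

surname_score = similarity(record_a["surname"], record_b["surname"])
first_name_score = similarity(record_a["first_name"], record_b["first_name"])
# Compare the street line against every address column, not just the same one.
street_score = best_match(record_a["address"][1], record_b["address"])

print(surname_score, first_name_score, street_score)
```

Here the surname and street line match strongly while the first names do not, so a real matcher would still need a rule for weighing conflicting fields.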
The Shed demo
How to make Big Data work?
1. Understand your problem
2. Apply appropriate tools
3. Automate everything.
How to make Big Data work?
1. Understand your problem
• ‘Big Data’ challenges aren’t necessarily new; however, much of the technology is
• Articulate and communicate: focus on distilling your problem down
• Incremental improvement, not wholesale replacement
2. Apply appropriate tools
• Understand the economics as well as the technology
• New technologies need to be evaluated within the context of your problem scope
• New technologies are enablers, not deliverables (#datalake)
• ‘Big Data’ technology should be seen as complementary to existing technology
3. Automate everything
• Continuous integration to include all testing
• Containerise where possible
• Measure everything
If you really want to get involved…
Get your hands dirty
If you’re interested in learning more, we’ll be hosting a hands-on labs event in the near future.
Send your details to:
Email: hello@thedatashed.co.uk
Twitter: @thedatashed
Any questions?
Lewis Crawford
Principal Architect @ the DataShed
thedatashed.co.uk
Lewis@thedatashed.co.uk