the mythology of big data
DESCRIPTION
Keynote presentation on some of the assumptions about "big data" and how we use information in organizations, given at the O'Reilly Strata conference, February 2011 Video can be viewed on YouTube at http://www.youtube.com/watch?v=HwVPxYWDO4w Note: I had to cut the story about payment and whiskey for time, which is what the video reference to booze is about. Slide 22 is where the embedded video is located if you want to watch it here.TRANSCRIPT
The Mythology
of Big Data
O’Reilly Strata Conference
February 2, 2011
Mark R. Madsen http://ThirdNature.net
@markmadsen
!"#$%&'#()*+,+-%&(.$$/#0&
1/')/*&/'0#,2&')#&0##30&
+2&/'0&+1*&3#0'$4(5+*6&
7+3#&/0&.&(+88+3/'%&
http://www.flickr.com/photos/ecstaticist/1120119742/
9).':0&')#&(#*'$.,&8%')&4*3#$,%/*-&;/-&3.'.<&
=)#&8%')&').'&3$+"#&')#&-+,3&$40)&
All we need is a fat
pipe and pans
working in parallel…
!"#$%&'()*$'($"+)$,-$'%.()$/01&2$1&+"#)&$"1&*+32$("1$'4"(*5$
!"+,45+*&+2&3.'.&
!"#$%"#&'()*)')#'+,-(./*'
0"#$1"#&'()*)')#'23+,-(./*'
4"#$""#&'()*)')#')##5*'
6"7"#'8&'()*)')#'#.2#*,)*5'
6&*$+*'4$7'1'$+*8"4#."($03$0($
,#30(*33$31+#%1#+*$'(7$
9+"%*33*3$'(7$&"/$1&*-$#3*$
0(:"+;'."(5$
!"#$%')/*-&/0&0+&3/>#$#*'&*+1?&
Your grandmother, the data scientist.
@.*%&(4$$#*'&.AA$+.()#0&8/00&')#&A+/*'&
B0/*-&C/-&D.'.&
E':0&*+'&.;+4'&F;/-G&
B0/*-&C/-&D.'.&
And “big” is often not as big as you think it is.
E':0&*+'&$#.,,%&.;+4'&3.'.H&#/')#$&
B0/*-&C/-&D.'.&
If there’s no process for applying information in a specific
context then you are producing expensive trivia.
9)#$#&3+#0&')#&".,4#&/*&3.'.&(+8#&2$+8<&
9-,':-#*'-;'.#'<='=-=$()*)'
2.#<=5##5#>'*?<#'*,)=#@)*5#'
*-'<="/$%'($/*$#3*$
0(:"+;'."($1"$0;9+"8*$
1&*$7*%030"(3$;'7*$0($"#+$
"+)'(0>'."(?@$
A5'=55('*-';-/.#'-='*?)*'
#<=B.@),@3'2)('(5/<#<-='
:)C<=B'5=D*3>'*?5'B,-.+E'
F,B)=<G)D-=#'#55:'*-'
):+@<;3'<==)*5'(5/<#<-='
:)C<=B'H)I#E'
D#(/0/+*I8.J/*-&$#.,/5#0&
J?5'-+5,)D=B':-(5@'<='#5=<-,'
:)=)B5:5=*'<#'+,<:),<@3'<=*.<D-=')=('
+)K5,=$2)#5(E'
J?5':-(5';-,':<((@5':)=)B5:5=*'<#'
+-@<D/)@>'2.,5)./,)D/E'
L5I'()*)'<#'(5#*)2<@<G<=B>'I?</?'<#'I?3'
3-.':)3'?<*')'I)@@'*,3<=B'*-'+.#?'3-.,'
()*)$(,<M5=')B5=()E'
N)*)'<#'/-=*5O*.)@>'#-'I5'=55('#*-,<5#'
*-'5O+@)<='?-I'I5'*?<=C'*?5'I-,@('
I-,C#>'I?3':3'()*)'<#'25K5,'*?)='
3-.,#>')=('I?3'3-.,'*?5-,3'#./C#E'
P-B=<DM5'2<)#'/,5)*5#')':-,)##';-,'
<=*5,+,5*)D-=E'
K&"#$%&.;0'$.('&;40/*#00&/*'#,,/-#*(#&8+3#,&
A?-'),5'*?5'+5-+@5':)C<=B'(5/<#<-=#Q'
R*,)*5B</'
J)/D/)@'
F+5,)D-=)@'
9).'&/0&')#&*.'4$#&+2&')#/$&3#(/0/+*0<&
R/-+5>'D:5';,):5'-;'(5/<#<-=>'D:5'#/)@5'-;'()*)>'()*)'
M-@.:5>'2,5)(*?'-;'()*)>';,5S.5=/3>'+)K5,='M#';)/*$2)#5('
R*,)*5B</'
J)/D/)@'
F+5,)D-=)@'
T-=*?#'
N)3#$A55C#'
T<=#$N)3#'
U! L.M#$*$2)#5('
U! C$+.3&0(+A#&
U! N.('I;.0#3&
U! @+3#$.'#&
0(+A#&
U! O4,#I;.0#3&
U! P.$$+1&0(+A#&
Analy
tic c
om
ple
xity
=)#&A$+(#00&.0A#('&+2&3#(/0/+*0&5#0&'+&A#+A,#&
R/-+5'-;'/-=*,-@';-,'+5-+@5'<=':-#*'-,B)=<G)D-=#')@<B=#&'
<='+,-/5##>'-='+,-/5##>'-M5,'+,-/5##'
R*,)*5B</'
J)/D/)@'
F+5,)D-=)@'
The exceptions not handled at one level due to rule /
procedure / policy deficiency are escalated to the next.
9).'&J/*3&+2&04AA+$'&3+&')#%&)."#&'+3.%<&
R*,)*5B</'
J)/D/)@'
F+5,)D-=)@'
Other people
Email, meetings
Reports, dashboards
Realm of traditional BI
Reality of most reports and dashboards is that they
provide basic monitoring at best.
R*,)*5B</'
J)/D/)@'
F+5,)D-=)@'
Q+1&.*3&1)#$#&(.*&%+4&.AA,%&3.'.&0+,45+*0<&
V<B?'#<=B@5'M)@.5>'@5##'
;,5S.5=*>'#-'<:+,-M5'*?5'
5W5/DM5=5##'-;'<=(<M<(.)@'
(5/<#<-=#E'
A#>>-$;0774*$)+"#(7$
X-I'#<=B@5'M)@.5>';,5S.5=*>'
/)='<:+,-M5'*?5'5Y/<5=/3'
-,'*?5'5W5/DM5=5##';-,'@),B5'
)BB,5B)*5'<:+,-M5:5=*E'
Ana
lytic c
om
ple
xity
9).'&3+&A#+A,#&3+&1/')&3.'.<&
R6! D#0($/;#&'.#5'()*)'*-'/?),)/*5,<G5')'/.,,5=*'-,'+,<-,'#*)*5'-;'*?5'
#3#*5:>';-,'5O):+@5':-=<*-,<=B')=('<(5=D;3<=B'5O/5+D-=#'
S6! E*"#05-.'#&'5O+@-,5'()*)'*-'(<#/-M5,'*?5'2-.=(),<5#')=('
/?),)/*5,<#D/#'-;')'#3#*5:>';,):5')'+,-2@5:'-,'Z=('
#.++-,D=B'['(<#/,5(<D=B'5M<(5=/5E'
T6! !UA,./*&'.#5'()*)')=(')=)@3D/':5*?-(#'*-'(5*5,:<=5'/).#5#'
)=('5W5/*#>'2.<@(':-(5@#')=('/-=#*,./*'#*-,<5#E'
V6! L$#3/('&')++@3')=)@3D/':-(5@#'*-'(5*5,:<=5'+-##<2@5'['+,-2)2@5'
;.*.,5'#*)*5#'-;'*?5'#3#*5:'
W6! L$#0($/;#&'.#5'()*)'<=':-(5@#'*-'(5Z=5'+-@</3>'+,-/5(.,5>')=('
,.@5#';-,'*)C<=B')/D-=>')=('+-##<2@3').*-:)*5'*?5:'
B'1'$0(:+'31+#%1#+*$'(7$1""4$3#99"+1$:"+$1&*3*$'%.80.*3$0($;"31$
"+)'(0>'."(3$03$#(*8*($'1$,*312$7*%+*'30()$'3$-"#$;"8*$7"/(5$
A0)#+*C$D0+"440$'(7$E'+72$FGGH$!>+$'&
X'$4('4$#&
!"#$%&#'()*#*%#+,#(#-(*(#./0,)1.*2#%3#+&04-#.%5'(3,#*%#.&66%3*#*7,82#3,(-#*70.#6(6,3##
\]'*--@:)C5,'#.//55(#')#>')=('-=@3')#>'*?5'#3*+3$-;'?<#'
*--@#'#.//55('I<*?'?<#')<(E'V-I5M5,'#?<=<=B'*?5'2@)(5>'
?-I5M5,'^5I5@5('*?5'?<@*>'?-I5M5,'+5,;5/*'*?5'?5_>')'
#I-,('<#'*5#*5('-=@3'23'/.`=BE'J?)*'#I-,(#:<*?'<#'
#.//5##;.@'I?-#5'/@<5=*#'(<5'-;'-@(')B5Ea'
' 'A+*7*+0%I$J+""I3$
About the Presenter
Mark Madsen is president of Third Nature, a technology research and consulting firm focused on business intelligence, analytics and performance management. Mark is an award-winning author, architect and former CTO whose work has been featured in numerous industry publications. During his career Mark received awards from the American Productivity & Quality Center, TDWI, Computerworld and the Smithsonian Institute. He is an international speaker, contributing editor at Intelligent Enterprise, and manages the open source channel at the Business Intelligence Network. For more information or to contact Mark, visit http://ThirdNature.net.
About Third Nature
Third Nature is a research and consulting firm focused on new and
emerging technology and practices in business intelligence, data
integration and information management. If your question is related to BI,
open source, web 2.0 or data integration then you‘re at the right place.
Our goal is to help companies take advantage of information-driven
management practices and applications. We offer education, consulting
and research services to support business and IT organizations as well as
technology vendors.
We fill the gap between what the industry analyst firms cover and what IT
needs. We specialize in product and technology analysis, so we look at
emerging technologies and markets, evaluating the products rather than
vendor market positions.