Big Data Little Tests - Agile Alliance
Big Data Little Tests
John Heintz
Founder, Gist Labs Technical Consultant, Cutter Consortium
john@gistlabs.com @jheintz
http://gistlabs.com
© 2012 Gist Labs, LLC
About John Heintz
• Developer since 1995
• Agilist since 1999
• Founded Gist Labs in 2008
• Developer, Mentor, Consultant
• Intuitive, Abstract, Precise
Kool-Aids I’ve drunk: Agile/Lean/Kanban, OO, TDD, REST, Mentoring, Craftsmanship, Emergent/Progressive Design, InnovationGames®, Systems and Complexity Theory
My Goals for You
• Demystify test automation for Big Data
• Provide executable examples
What you shouldn’t expect…
• More than a brief introduction to Big Data concepts
• Performance tuning
Simple Code, Config
• I went as simple and clear as possible
• Java, JUnit4
• Maven… okay maybe not simple :-\
Mostly Code
• Remember the Law of Two Feet
• If code isn’t what you were looking for, I totally respect you finding something better for your time :-)
• Everything available from http://gistlabs.com/2012/08/big-data-little-tests/
• The entire command script is there… so you can take notes assuming that’s available
My Soapboxes…
These are topics I’ll repeat myself on
• Fast test execution
• One-click build
Big Data
• Too much
• Too fast
• Not trivially structured
Map Reduce
• Map from one input to one output
• Reduce from many inputs to one output
• Can be run in parallel
• Crude, but massive
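A minimal sketch of the map and reduce steps in plain Java, with no Hadoop dependency. The class name `WordCountSketch` and its methods are hypothetical, chosen only for illustration; they are not taken from the talk's code and are not the Hadoop API. Word counting is used as the example, so the map step here emits one entry per word on a line, and the in-memory grouping in `run` stands in for the shuffle a real framework performs between the two steps.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration class; not part of Hadoop or the talk's code.
class WordCountSketch {

    // Map step: one input line in, the words found on it out.
    // Each emitted word implicitly carries a count of 1.
    static List<String> map(String line) {
        List<String> words = new ArrayList<>();
        for (String token : line.toLowerCase().split("\\s+")) {
            if (!token.isEmpty()) {
                words.add(token);
            }
        }
        return words;
    }

    // Reduce step: many counts for a single word in, one total out.
    static int reduce(List<Integer> counts) {
        int total = 0;
        for (int count : counts) {
            total += count;
        }
        return total;
    }

    // Tiny in-memory driver: groups map output by word, then reduces each group.
    static Map<String, Integer> run(List<String> lines) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String line : lines) {
            for (String word : map(line)) {
                grouped.computeIfAbsent(word, k -> new ArrayList<>()).add(1);
            }
        }
        Map<String, Integer> totals = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> entry : grouped.entrySet()) {
            totals.put(entry.getKey(), reduce(entry.getValue()));
        }
        return totals;
    }

    public static void main(String[] args) {
        // Prints {big=2, data=1, little=1, tests=2}
        System.out.println(run(Arrays.asList("big data", "little tests", "big tests")));
    }
}
```

Because `map` looks at one line at a time and `reduce` looks at one word's counts at a time, both steps can be tested as ordinary functions and, in a real cluster, run in parallel across many machines.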
CAP Theorem
• Consistency
• Availability
• Partition Tolerance
Big Data Ecosystem
• Hadoop: A giant among giants (tons of projects on this platform!)
• Cassandra: Feels like a weird RDBMS
• Riak: An elegant key/value/search store
• MongoDB: Document store
Let’s Run Some Code
Hadoop Tests
Riak tests
Other Frameworks
• CassandraUnit: https://github.com/jsevellec/cassandra-unit
• PigUnit, for unit testing Pig scripts on Hadoop: http://pig.apache.org/docs/r0.8.1/pigunit.html
Code Questions?
• Fast test execution?
• One-click build?
What about Big Tests?
• Real test data
• Realistic cluster
Real Test Data
My favorite strategy is to:
• Develop with small, crafted data
• Build/test the same way
• Run another test on top of real prod data
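One way the strategy above might look in Java, as a sketch under stated assumptions: the class `RealDataCheckSketch`, the method `countMalformed`, and the tab-separated `key<TAB>number` record format are all hypothetical and not taken from the talk's code. The idea shown is that the same check runs first against a few hand-crafted records and then, only when a file path is supplied, against an extract of real data.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration class; the record format is an assumption.
class RealDataCheckSketch {

    // Logic under test: count records that are not "key<TAB>digits".
    static long countMalformed(List<String> records) {
        long malformed = 0;
        for (String record : records) {
            String[] parts = record.split("\t");
            if (parts.length != 2 || !parts[1].matches("\\d+")) {
                malformed++;
            }
        }
        return malformed;
    }

    public static void main(String[] args) throws IOException {
        // Small, crafted data: fast enough to run on every build.
        List<String> crafted = Arrays.asList("big\t2", "data\t1", "oops");
        System.out.println("crafted malformed: " + countMalformed(crafted));

        // The same check over a real extract, only when a path is passed in.
        if (args.length > 0) {
            List<String> real = Files.readAllLines(Paths.get(args[0]));
            System.out.println("real malformed: " + countMalformed(real));
        }
    }
}
```

Keeping the crafted-data run free of any external file keeps it fast, while the optional second run reuses exactly the same logic on production-shaped input.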
[Diagram: Developers, Version Control, Continuous Integration Servers, and Continuous Deployment Servers feeding Build, Test1, and Test2 clusters, then Staging and Production. Supporting concerns: Virtual vs Physical Servers, Network Infrastructure, Storage Infrastructure, Developer Sandboxes, Self-service Provisioning, Private vs Public Cloud]
Realistic Cluster
• Use a CI/DevOps environment
• Virtualize, “X as a Service”
• Virtual Machines
• Virtual Infrastructure (Network, Storage)
Jenkins CI Server
• Master/slave clusters
• Plugins for Hadoop and VMWare
• http://jenkins-ci.org/
Big Questions?
Thank you!
• Everything available from:
http://gistlabs.com/2012/08/big-data-little-tests/
• John Heintz, @jheintz, http://gistlabs.com