![Page 1: Steve Crouch, Greg Wilson Software Carpentry Why should scientists understand how to write better software? This work is licensed under the Creative Commons](https://reader035.vdocuments.us/reader035/viewer/2022070412/56649e165503460f94b00616/html5/thumbnails/1.jpg)
Steve Crouch, Greg WilsonSoftware Carpentry
Why should scientists understandhow to write better software?
This work is licensed under the Creative Commons Attribution License
Copyright © Software Carpentry and The University of Edinburgh 2011-2012
See http://software-carpentry.org/license.html for more information.
![Page 2: Steve Crouch, Greg Wilson Software Carpentry Why should scientists understand how to write better software? This work is licensed under the Creative Commons](https://reader035.vdocuments.us/reader035/viewer/2022070412/56649e165503460f94b00616/html5/thumbnails/2.jpg)
What we think we know…
Once upon a time…
• 7 Years War (1754-63)
• Britain loses 1512 sailors to enemy action…
• …and almost 100,000 to scurvy
![Page 3: Steve Crouch, Greg Wilson Software Carpentry Why should scientists understand how to write better software? This work is licensed under the Creative Commons](https://reader035.vdocuments.us/reader035/viewer/2022070412/56649e165503460f94b00616/html5/thumbnails/3.jpg)
What we think we know…
But before then…
Courtesy of the National Library ofMedicine
James Lind (1716-94)
1747: (possibly) first ever controlled medical experiment
Cider Sea water
Sulphuric acid (!) Oranges
Vinegar Barley water
![Page 4: Steve Crouch, Greg Wilson Software Carpentry Why should scientists understand how to write better software? This work is licensed under the Creative Commons](https://reader035.vdocuments.us/reader035/viewer/2022070412/56649e165503460f94b00616/html5/thumbnails/4.jpg)
What we think we know…
Oranges (and lemons)…
Courtesy of the National Library ofMedicine
• And no-one listened…he’s not an English gentleman!• But in 1794 the Admiralty gave it a go…• …and England came out on top!
Cider Sea water
Sulphuric acid Oranges
Vinegar Barley water
James Lind (1716-94)
1747: (possibly) first ever controlled medical experiment
![Page 5: Steve Crouch, Greg Wilson Software Carpentry Why should scientists understand how to write better software? This work is licensed under the Creative Commons](https://reader035.vdocuments.us/reader035/viewer/2022070412/56649e165503460f94b00616/html5/thumbnails/5.jpg)
What we think we know…
Modern medicine
• Medical profession eventually realised controlled studies were the right way to go
• David Sackett
• Pioneer of “evidence-based medicine”
• Randomised double-blind test is a “gold standard”
• Cochrane Collaboration
• http://www.cochrane.org/
• Largest collection of recordsof randomised controlled trialsin the world
Image courtesy of Lab Science Career
![Page 6: Steve Crouch, Greg Wilson Software Carpentry Why should scientists understand how to write better software? This work is licensed under the Creative Commons](https://reader035.vdocuments.us/reader035/viewer/2022070412/56649e165503460f94b00616/html5/thumbnails/6.jpg)
What we think we know…
What about software development?
• Martin Fowler IEEE Software, July/Aug 2009
• “[Using domain specific language] leads to two primary benefits. The first, and simplest, is improved programmer productivity... The second... is... communication with domain experts.”
Image courtesy of adewale_oshineye
![Page 7: Steve Crouch, Greg Wilson Software Carpentry Why should scientists understand how to write better software? This work is licensed under the Creative Commons](https://reader035.vdocuments.us/reader035/viewer/2022070412/56649e165503460f94b00616/html5/thumbnails/7.jpg)
What we think we know…
Just one more thing…
• Two substantive claims…
• …without a single citation!
• Where’s the proof?
• “Many software researchers advocate rather than investigate” – Robert L. Glass (2002)
• Par for the course!
Image courtesy of whatleydude
Image courtesy of futureatlas.com
![Page 8: Steve Crouch, Greg Wilson Software Carpentry Why should scientists understand how to write better software? This work is licensed under the Creative Commons](https://reader035.vdocuments.us/reader035/viewer/2022070412/56649e165503460f94b00616/html5/thumbnails/8.jpg)
What we think we know…
Times, they are a changin’
• Growing emphasis on empirical studies since the mid-1990s
• Papers describing new tools or practices routinely include results from some kind of field study
• International Conference on Software Engineering
• Empirical Software Engineering
• “It will never work in theory”
![Page 9: Steve Crouch, Greg Wilson Software Carpentry Why should scientists understand how to write better software? This work is licensed under the Creative Commons](https://reader035.vdocuments.us/reader035/viewer/2022070412/56649e165503460f94b00616/html5/thumbnails/9.jpg)
What we think we know…
Simply the best?
• “Exploratory experimental studies comparing online and offline programming performance”
• Sackman, Erikson and Grant (1968)
• “The best programmers are up to 28 times more productive than the worst”
• So all we need are a few good people (apparently)…
• 1968!
• Designed to compare batch versus interactive
• 12 programmers for an afternoon
![Page 10: Steve Crouch, Greg Wilson Software Carpentry Why should scientists understand how to write better software? This work is licensed under the Creative Commons](https://reader035.vdocuments.us/reader035/viewer/2022070412/56649e165503460f94b00616/html5/thumbnails/10.jpg)
What we think we know…
So what do we know?
• “Some Experience with Automated Aids to the Design of Large-Scale Reliable Software”
• Boehm et al (1975)
• Most errors are introduced during requirements analysis and design
• The later an error is detected the more costly it is to address
• 1 hour to fix in the design
• 10 hours to fix in the code
• 100 hours to fix after it’s gone live…time
num
ber
/ co
st
![Page 11: Steve Crouch, Greg Wilson Software Carpentry Why should scientists understand how to write better software? This work is licensed under the Creative Commons](https://reader035.vdocuments.us/reader035/viewer/2022070412/56649e165503460f94b00616/html5/thumbnails/11.jpg)
What we think we know…
Why reading is good…
• Rigorous inspections can remove 60-90% of errors before first test is run• Fagan (1975) “Design and Code
Inspections to Reduce Errors in Program Development"
• The first review and hour matter most• Cohen (2006) “Best Kept Secrets of
Peer Code Review”
![Page 12: Steve Crouch, Greg Wilson Software Carpentry Why should scientists understand how to write better software? This work is licensed under the Creative Commons](https://reader035.vdocuments.us/reader035/viewer/2022070412/56649e165503460f94b00616/html5/thumbnails/12.jpg)
What we think we know…
Errors of a feather…• Half the errors are found in 15% of the
modules • Davis (1995) quoting Endres (1975)
• About 80% of the defects come from 20% of the modules, and about half the modules are error free• Boehm and Basili (2001)
• When you identify more errors than expected in some program module, keep looking!
![Page 13: Steve Crouch, Greg Wilson Software Carpentry Why should scientists understand how to write better software? This work is licensed under the Creative Commons](https://reader035.vdocuments.us/reader035/viewer/2022070412/56649e165503460f94b00616/html5/thumbnails/13.jpg)
What we think we know…
How did you learn to program?• You don’t just go out and write
War and Peace
• You’d read other skilled writers first
• You don’t just go out and write a foreign language
• You learn to read it first
• Teach maintenance first
• Harlan Mills (1990)
Image courtesy of dwyman
“The best way to prepare [to be a programmer] is to write programs and to study great programs that other people have written” Susan Lammers, 1986, quoting a younger Bill Gates
![Page 14: Steve Crouch, Greg Wilson Software Carpentry Why should scientists understand how to write better software? This work is licensed under the Creative Commons](https://reader035.vdocuments.us/reader035/viewer/2022070412/56649e165503460f94b00616/html5/thumbnails/14.jpg)
What we think we know…
Beware false prophets!
• “Anchoring and Adjustment in Software Estimation”
• Aranda & Easterbrook (2005)
• “How long do you think it will take to make a change to this program?”
Control Group: “I’d like to give an estimate for this project myself, but I admit I have no experience estimating. We’ll wait for your calculations for an estimate.”
Group A: “I admit I have no experience with software projects, but I guess this will take about 2 months to finish.”
Group B: “… I guess this will take about 20 months to finish.”
![Page 15: Steve Crouch, Greg Wilson Software Carpentry Why should scientists understand how to write better software? This work is licensed under the Creative Commons](https://reader035.vdocuments.us/reader035/viewer/2022070412/56649e165503460f94b00616/html5/thumbnails/15.jpg)
What we think we know…
Anchors drag you down
• Novice’s estimate mattered more than…• Experience in software engineering• Tools used• Formality of estimation
Group A (lowball) 5.1 months
Control Group 7.8 months
Group B (highball) 15.4 months
![Page 16: Steve Crouch, Greg Wilson Software Carpentry Why should scientists understand how to write better software? This work is licensed under the Creative Commons](https://reader035.vdocuments.us/reader035/viewer/2022070412/56649e165503460f94b00616/html5/thumbnails/16.jpg)
What we think we know…
Working remotely
• Physical distance doesn’t affect post-release fault rates
• …distance in organisational chart does
• Nagappan et al (2007) & Bird et al (2009)
![Page 17: Steve Crouch, Greg Wilson Software Carpentry Why should scientists understand how to write better software? This work is licensed under the Creative Commons](https://reader035.vdocuments.us/reader035/viewer/2022070412/56649e165503460f94b00616/html5/thumbnails/17.jpg)
What we think we know…
And some more…
• For every 20% increase in problem complexity, there is a 100% increase in solution complexity
• Woodfield (1979)
• The two biggest causes of project failure are poor estimation and unstable requirements
• van Genuchten (1991) and many others
• Development teams do not benefit from existing experience, and theyrepeat mistakes over and over again
• Brossler (1999)
![Page 18: Steve Crouch, Greg Wilson Software Carpentry Why should scientists understand how to write better software? This work is licensed under the Creative Commons](https://reader035.vdocuments.us/reader035/viewer/2022070412/56649e165503460f94b00616/html5/thumbnails/18.jpg)
What we think we know…
And there’s many more…!
http://software-carpentry.org/4_0/softeng/ebse
![Page 19: Steve Crouch, Greg Wilson Software Carpentry Why should scientists understand how to write better software? This work is licensed under the Creative Commons](https://reader035.vdocuments.us/reader035/viewer/2022070412/56649e165503460f94b00616/html5/thumbnails/19.jpg)
What we think we know…
So…
Shouldn’t our development practices be built around these, and other, facts?
+
![Page 20: Steve Crouch, Greg Wilson Software Carpentry Why should scientists understand how to write better software? This work is licensed under the Creative Commons](https://reader035.vdocuments.us/reader035/viewer/2022070412/56649e165503460f94b00616/html5/thumbnails/20.jpg)
What we think we know…
What is Software Carpentryand how to get involved?
![Page 21: Steve Crouch, Greg Wilson Software Carpentry Why should scientists understand how to write better software? This work is licensed under the Creative Commons](https://reader035.vdocuments.us/reader035/viewer/2022070412/56649e165503460f94b00616/html5/thumbnails/21.jpg)
What we think we know…
Software Carpentry: what is it?
Goal: Make scientists and engineersmore productive
• Teach them basic computing skills
• Things they should know before they start
• Few days of training
• Saves researchers a day a week
• Improves quality of computational work
• Empirically based, not advocacy based
• Skills: how they contribute to correct, reproducible, reusable research
![Page 22: Steve Crouch, Greg Wilson Software Carpentry Why should scientists understand how to write better software? This work is licensed under the Creative Commons](https://reader035.vdocuments.us/reader035/viewer/2022070412/56649e165503460f94b00616/html5/thumbnails/22.jpg)
What we think we know…
A bit of history…
http://www.software-carpentry.org
• Started by Greg Wilson in 1998
• Gone through five iterations of improvement
• All content under Creative Commons licence
• Become international initiative
• Worldwide contributors
• Worldwide training events
• Always looking for more contributors!
![Page 23: Steve Crouch, Greg Wilson Software Carpentry Why should scientists understand how to write better software? This work is licensed under the Creative Commons](https://reader035.vdocuments.us/reader035/viewer/2022070412/56649e165503460f94b00616/html5/thumbnails/23.jpg)
What we think we know…
What is a boot camp?• In-person, example driven workshop
• 2-3 days• Usually 20-40 people• 2-3 instructors + helpers
• Core skills to be productivein research team• Basic programming• Version control• Testing• …
• Short tutorials & hands-on exercises• Participants help each other
![Page 24: Steve Crouch, Greg Wilson Software Carpentry Why should scientists understand how to write better software? This work is licensed under the Creative Commons](https://reader035.vdocuments.us/reader035/viewer/2022070412/56649e165503460f94b00616/html5/thumbnails/24.jpg)
Python Introduction
Software Carpentry –Example Agenda
Monday May 14th
08:30 - 09:00 Registration
09:00 - 09:15 Introduction Clive, Steve, Neil
09:15 - 09:45 What we actually know about developing software, and why
we believe it's true Steve C, Secondary Steve M
09:45 - 12:30 The Unix shell Steve M, Secondary Mike
12:30 - 13:30 Lunch
13:30 - 15:30 Version control Chris, Secondary Mike
15:30 - 17:00 The basics of Python Mike, Secondary Steve M
17:00 19:00 Gathering at the Northern Stage
Tuesday May 15
09:00 - 12:30 Continuing Python programming
- Functions - Steve C, Secondary Steve M
- Testing - Mike, Secondary Steve C
12:30 - 13:30 Lunch
13:30 - 15:30 Using relational databases Steve C, Secondary Steve M
15:30 - 17:00 Putting it all together Mike, Secondary Steve C
SSI supporting
SSI leading
![Page 25: Steve Crouch, Greg Wilson Software Carpentry Why should scientists understand how to write better software? This work is licensed under the Creative Commons](https://reader035.vdocuments.us/reader035/viewer/2022070412/56649e165503460f94b00616/html5/thumbnails/25.jpg)
What we think we know…
Why a boot camp?• Recent research: students learn best in
blended environment, combining:
• Directed in-person learning
• Self-directed online learning
• Researchers are busy people! 2-3 days ok
• Hard to stay motivated learning in isolation
• Boot camps create peer support communities
• In selected disciplines or geographic regions
![Page 26: Steve Crouch, Greg Wilson Software Carpentry Why should scientists understand how to write better software? This work is licensed under the Creative Commons](https://reader035.vdocuments.us/reader035/viewer/2022070412/56649e165503460f94b00616/html5/thumbnails/26.jpg)
What we think we know…
Why get involved?• Critical realisation that scientists need to write
better software
• Reproducibility in data, software next in line?
• Be at the forefront
• Great way to get teaching experience
• Learn new techniques
• Networking, trips to nice places!
• Be a local expert
• Looks good on a CV
![Page 27: Steve Crouch, Greg Wilson Software Carpentry Why should scientists understand how to write better software? This work is licensed under the Creative Commons](https://reader035.vdocuments.us/reader035/viewer/2022070412/56649e165503460f94b00616/html5/thumbnails/27.jpg)
Want to get involved?
• Write online lectures and practicals
• Want to be a helper…• …or instructor?• Let us know!
• Get in touch!• [email protected]• [email protected]