steve crouch, greg wilson software carpentry why should scientists understand how to write better...

27
Steve Crouch, Greg Wilson Software Carpentry Why should scientists understand how to write better software? This work is licensed under the Creative Commons Attribution Lice Copyright © Software Carpentry and The University of Edinburgh 20 See http://software-carpentry.org/license.html for more informati

Upload: emily-lesley-scott

Post on 16-Jan-2016

212 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Steve Crouch, Greg Wilson Software Carpentry Why should scientists understand how to write better software? This work is licensed under the Creative Commons

Steve Crouch, Greg WilsonSoftware Carpentry

Why should scientists understandhow to write better software?

This work is licensed under the Creative Commons Attribution License

Copyright © Software Carpentry and The University of Edinburgh 2011-2012

See http://software-carpentry.org/license.html for more information.

Page 2: Steve Crouch, Greg Wilson Software Carpentry Why should scientists understand how to write better software? This work is licensed under the Creative Commons

What we think we know…

Once upon a time…

• 7 Years War (1754-63)

• Britain loses 1512 sailors to enemy action…

• …and almost 100,000 to scurvy

Page 3: Steve Crouch, Greg Wilson Software Carpentry Why should scientists understand how to write better software? This work is licensed under the Creative Commons

What we think we know…

But before then…

Courtesy of the National Library ofMedicine

James Lind (1716-94)

1747: (possibly) first ever controlled medical experiment

Cider Sea water

Sulphuric acid (!) Oranges

Vinegar Barley water

Page 4: Steve Crouch, Greg Wilson Software Carpentry Why should scientists understand how to write better software? This work is licensed under the Creative Commons

What we think we know…

Oranges (and lemons)…

Courtesy of the National Library ofMedicine

• And no-one listened…he’s not an English gentleman!• But in 1794 the Admiralty gave it a go…• …and England came out on top!

Cider Sea water

Sulphuric acid Oranges

Vinegar Barley water

James Lind (1716-94)

1747: (possibly) first ever controlled medical experiment

Page 5: Steve Crouch, Greg Wilson Software Carpentry Why should scientists understand how to write better software? This work is licensed under the Creative Commons

What we think we know…

Modern medicine

• Medical profession eventually realised controlled studies were the right way to go

• David Sackett

• Pioneer of “evidence-based medicine”

• Randomised double-blind test is a “gold standard”

• Cochrane Collaboration

• http://www.cochrane.org/

• Largest collection of recordsof randomised controlled trialsin the world

Image courtesy of Lab Science Career

Page 6: Steve Crouch, Greg Wilson Software Carpentry Why should scientists understand how to write better software? This work is licensed under the Creative Commons

What we think we know…

What about software development?

• Martin Fowler IEEE Software, July/Aug 2009

• “[Using domain specific language] leads to two primary benefits. The first, and simplest, is improved programmer productivity... The second... is... communication with domain experts.”

Image courtesy of adewale_oshineye

Page 7: Steve Crouch, Greg Wilson Software Carpentry Why should scientists understand how to write better software? This work is licensed under the Creative Commons

What we think we know…

Just one more thing…

• Two substantive claims…

• …without a single citation!

• Where’s the proof?

• “Many software researchers advocate rather than investigate” – Robert L. Glass (2002)

• Par for the course!

Image courtesy of whatleydude

Image courtesy of futureatlas.com

Page 8: Steve Crouch, Greg Wilson Software Carpentry Why should scientists understand how to write better software? This work is licensed under the Creative Commons

What we think we know…

Times, they are a changin’

• Growing emphasis on empirical studies since the mid-1990s

• Papers describing new tools or practices routinely include results from some kind of field study

• International Conference on Software Engineering

• Empirical Software Engineering

• “It will never work in theory”

Page 9: Steve Crouch, Greg Wilson Software Carpentry Why should scientists understand how to write better software? This work is licensed under the Creative Commons

What we think we know…

Simply the best?

• “Exploratory experimental studies comparing online and offline programming performance”

• Sackman, Erikson and Grant (1968)

• “The best programmers are up to 28 times more productive than the worst”

• So all we need are a few good people (apparently)…

• 1968!

• Designed to compare batch versus interactive

• 12 programmers for an afternoon

Page 10: Steve Crouch, Greg Wilson Software Carpentry Why should scientists understand how to write better software? This work is licensed under the Creative Commons

What we think we know…

So what do we know?

• “Some Experience with Automated Aids to the Design of Large-Scale Reliable Software”

• Boehm et al (1975)

• Most errors are introduced during requirements analysis and design

• The later an error is detected the more costly it is to address

• 1 hour to fix in the design

• 10 hours to fix in the code

• 100 hours to fix after it’s gone live…time

num

ber

/ co

st

Page 11: Steve Crouch, Greg Wilson Software Carpentry Why should scientists understand how to write better software? This work is licensed under the Creative Commons

What we think we know…

Why reading is good…

• Rigorous inspections can remove 60-90% of errors before first test is run• Fagan (1975) “Design and Code

Inspections to Reduce Errors in Program Development"

• The first review and hour matter most• Cohen (2006) “Best Kept Secrets of

Peer Code Review”

Page 12: Steve Crouch, Greg Wilson Software Carpentry Why should scientists understand how to write better software? This work is licensed under the Creative Commons

What we think we know…

Errors of a feather…• Half the errors are found in 15% of the

modules • Davis (1995) quoting Endres (1975)

• About 80% of the defects come from 20% of the modules, and about half the modules are error free• Boehm and Basili (2001)

• When you identify more errors than expected in some program module, keep looking!

Page 13: Steve Crouch, Greg Wilson Software Carpentry Why should scientists understand how to write better software? This work is licensed under the Creative Commons

What we think we know…

How did you learn to program?• You don’t just go out and write

War and Peace

• You’d read other skilled writers first

• You don’t just go out and write a foreign language

• You learn to read it first

• Teach maintenance first

• Harlan Mills (1990)

Image courtesy of dwyman

“The best way to prepare [to be a programmer] is to write programs and to study great programs that other people have written” Susan Lammers, 1986, quoting a younger Bill Gates

Page 14: Steve Crouch, Greg Wilson Software Carpentry Why should scientists understand how to write better software? This work is licensed under the Creative Commons

What we think we know…

Beware false prophets!

• “Anchoring and Adjustment in Software Estimation”

• Aranda & Easterbrook (2005)

• “How long do you think it will take to make a change to this program?”

Control Group: “I’d like to give an estimate for this project myself, but I admit I have no experience estimating. We’ll wait for your calculations for an estimate.”

Group A: “I admit I have no experience with software projects, but I guess this will take about 2 months to finish.”

Group B: “… I guess this will take about 20 months to finish.”

Page 15: Steve Crouch, Greg Wilson Software Carpentry Why should scientists understand how to write better software? This work is licensed under the Creative Commons

What we think we know…

Anchors drag you down

• Novice’s estimate mattered more than…• Experience in software engineering• Tools used• Formality of estimation

Group A (lowball) 5.1 months

Control Group 7.8 months

Group B (highball) 15.4 months

Page 16: Steve Crouch, Greg Wilson Software Carpentry Why should scientists understand how to write better software? This work is licensed under the Creative Commons

What we think we know…

Working remotely

• Physical distance doesn’t affect post-release fault rates

• …distance in organisational chart does

• Nagappan et al (2007) & Bird et al (2009)

Page 17: Steve Crouch, Greg Wilson Software Carpentry Why should scientists understand how to write better software? This work is licensed under the Creative Commons

What we think we know…

And some more…

• For every 20% increase in problem complexity, there is a 100% increase in solution complexity

• Woodfield (1979)

• The two biggest causes of project failure are poor estimation and unstable requirements

• van Genuchten (1991) and many others

• Development teams do not benefit from existing experience, and theyrepeat mistakes over and over again

• Brossler (1999)

Page 18: Steve Crouch, Greg Wilson Software Carpentry Why should scientists understand how to write better software? This work is licensed under the Creative Commons

What we think we know…

And there’s many more…!

http://software-carpentry.org/4_0/softeng/ebse

Page 19: Steve Crouch, Greg Wilson Software Carpentry Why should scientists understand how to write better software? This work is licensed under the Creative Commons

What we think we know…

So…

Shouldn’t our development practices be built around these, and other, facts?

+

Page 20: Steve Crouch, Greg Wilson Software Carpentry Why should scientists understand how to write better software? This work is licensed under the Creative Commons

What we think we know…

What is Software Carpentryand how to get involved?

Page 21: Steve Crouch, Greg Wilson Software Carpentry Why should scientists understand how to write better software? This work is licensed under the Creative Commons

What we think we know…

Software Carpentry: what is it?

Goal: Make scientists and engineersmore productive

• Teach them basic computing skills

• Things they should know before they start

• Few days of training

• Saves researchers a day a week

• Improves quality of computational work

• Empirically based, not advocacy based

• Skills: how they contribute to correct, reproducible, reusable research

Page 22: Steve Crouch, Greg Wilson Software Carpentry Why should scientists understand how to write better software? This work is licensed under the Creative Commons

What we think we know…

A bit of history…

http://www.software-carpentry.org

• Started by Greg Wilson in 1998

• Gone through five iterations of improvement

• All content under Creative Commons licence

• Become international initiative

• Worldwide contributors

• Worldwide training events

• Always looking for more contributors!

Page 23: Steve Crouch, Greg Wilson Software Carpentry Why should scientists understand how to write better software? This work is licensed under the Creative Commons

What we think we know…

What is a boot camp?• In-person, example driven workshop

• 2-3 days• Usually 20-40 people• 2-3 instructors + helpers

• Core skills to be productivein research team• Basic programming• Version control• Testing• …

• Short tutorials & hands-on exercises• Participants help each other

Page 24: Steve Crouch, Greg Wilson Software Carpentry Why should scientists understand how to write better software? This work is licensed under the Creative Commons

Python Introduction

Software Carpentry –Example Agenda

Monday May 14th

08:30 - 09:00 Registration

09:00 - 09:15 Introduction Clive, Steve, Neil

09:15 - 09:45 What we actually know about developing software, and why

we believe it's true Steve C, Secondary Steve M

09:45 - 12:30 The Unix shell Steve M, Secondary Mike

12:30 - 13:30 Lunch

13:30 - 15:30 Version control Chris, Secondary Mike

15:30 - 17:00 The basics of Python Mike, Secondary Steve M

17:00 19:00 Gathering at the Northern Stage

 

Tuesday May 15

09:00 - 12:30 Continuing Python programming

- Functions - Steve C, Secondary Steve M

- Testing - Mike, Secondary Steve C

12:30 - 13:30 Lunch

13:30 - 15:30 Using relational databases Steve C, Secondary Steve M

15:30 - 17:00 Putting it all together Mike, Secondary Steve C

SSI supporting

SSI leading

Page 25: Steve Crouch, Greg Wilson Software Carpentry Why should scientists understand how to write better software? This work is licensed under the Creative Commons

What we think we know…

Why a boot camp?• Recent research: students learn best in

blended environment, combining:

• Directed in-person learning

• Self-directed online learning

• Researchers are busy people! 2-3 days ok

• Hard to stay motivated learning in isolation

• Boot camps create peer support communities

• In selected disciplines or geographic regions

Page 26: Steve Crouch, Greg Wilson Software Carpentry Why should scientists understand how to write better software? This work is licensed under the Creative Commons

What we think we know…

Why get involved?• Critical realisation that scientists need to write

better software

• Reproducibility in data, software next in line?

• Be at the forefront

• Great way to get teaching experience

• Learn new techniques

• Networking, trips to nice places!

• Be a local expert

• Looks good on a CV

Page 27: Steve Crouch, Greg Wilson Software Carpentry Why should scientists understand how to write better software? This work is licensed under the Creative Commons

Want to get involved?

• Write online lectures and practicals

• Want to be a helper…• …or instructor?• Let us know!

• Get in touch!• [email protected][email protected]