so, you are saying that our software quality was screwed up by ... the release engineer?! (keynote @...
Post on 07-May-2015
So, you are saying that our software quality was screwed up by ... the release engineer?!
Bram Adams, MCIS
Who is Bram Adams?
Wolfgang De Meuter, Vrije Universiteit Brussel
Herman Tromp, Ghent University
Ahmed E. Hassan, Queen's University
MCIS (Lab on Maintenance, Construction and Intelligence of Software)
MCIS: Jojo, Mini, Parastou, You?
joint work with
Tejinder Dhaliwal, Foutse Khomh, Ying Zou, Mika Mäntylä, Emelie Engström, Kai Petersen, Ahmed E. Hassan, Daniel German
On average we deploy new code fifty times
a day.
Continuous Delivery
http://goo.gl/qPT6
VCS -> continuous integration -> test -> staging/production
9 min., 15k tests, 6 min.
IMVU, eh? (BENEVOL-ers)
Work fast and don’t be afraid to break
things.
http://goo.gl/UlCW
James Whittaker
Build a little and then test it.
Build some more and test some more.
Even Desktop Apps Release More Frequently
[Chart: #releases per year, 2002-2011]
http://en.wikipedia.org/wiki/History_of_Firefox
http://hacks.mozilla.org/wp-content/uploads/2012/05/rapid-release.jpg
... although it's Kind of an Illusion!
Rapid Release in a Nutshell
reduce cycle time!
in-house/3rd party development -> integrate -> build -> test -> packaging & delivery
http://behrns.files.wordpress.com/2008/03/ikea-car.jpg
Are Rapid Releases Worth the Effort? (BENEVOL-ers)
rapid release =? Certificate of High Quality
OR
rapid release =? [image]
Traditional Release Cycle: 1.0, 1.5, 2.0, 3.0, 3.5, 3.6, 4.0
Rapid Release Cycle: 5.0, 6.0, 7.0, 8.0, 9.0
Do Faster Releases Improve Software Quality? - An Empirical Case Study of Mozilla Firefox (MSR '12, Khomh et al.)
Socorro crash reports
Bugzilla bug reports
release model vs.
Same # Post-release Bugs
[Box plot: Number of Post Release Bugs Per Day, Traditional Release (TR) vs. Rapid Release (RR); labelled values: 13, 20, 1.6, 1.7]
We have about 30 of these project branches active at any one time. That means as a company we can have 30 or more experiments safely running in parallel. ...
I’m guessing that most of the bugs end up being integration issues [of these branches into the main trunk] otherwise the project branch wouldn’t have merged back in with mainstream devel.
Crashes Pop Up Earlier
[Box plot: Median UpTime in Seconds, Traditional Release (TR) vs. Rapid Release (RR); labelled values: 643, 370]
Firefox 4 was a crisis time for Firefox and Mozilla, since it took so long to complete and was a painful engineering effort. We decided to make our development more agile by switching to rapid release at the same time as we were making a bunch of other changes [...], and starting Firefox OS development
release model vs. bug fixing
Bugs are Fixed Faster
[Box plot: Bug Age in Days, Traditional Release (TR) vs. Rapid Release (RR); labelled values: 654, 159, 16, 6]
Proportionally Fewer Bugs Fixed
[Bar chart: % of Bugs Fixed for Traditional Release (TR) - Main, Rapid Release (RR) - Main, Traditional Release (TR) - Beta, Rapid Release (RR) - Beta; labelled values: 59%, 37%, 67%, 43%]
I’m not surprised that fewer bugs are fixed: some defects are trivial and some will take weeks of careful code inspection to find and fix. All those bugs would have fallen in the same bucket for a 12+ month release cycle, but now the harder ones get pushed out to later releases
can be better interpreted that we are being less effective at triaging bugs with rapid release, or that we have more beta users and so the incoming bug rates are larger
same # bugs ... but faster crashes
bugs are fixed faster ... but fewer bugs are being fixed
Shall we Blame the Testers? (BENEVOL-ers)
Did the System Testing Process Change?
[Timeline '06-'12: traditional releases 2.0, 3.0, 3.5, 3.6, 4.0; rapid releases 5.0-13.0]
Rapid Release Model vs. Software Testing Effort – Study of Mozilla Firefox Project (ICSM '13, Mäntylä et al.)
Different Test Phases
integration + build
release
(selection of) unit tests
all unit tests
integration tests
user acceptance/system tests
test phase time to run (hours)
capacity tests
smoke tests
Testcase ID #: 8410
Summary: All Tab options unchecked
Product: Firefox
Branch: 3.6 Branch
Regression Bug ID #: None specified
Enabled?
Community Enabled?
Steps to Perform:
1. Open Tool | Options (Preferences) | Tabs. Make sure all check boxes are unchecked.
Expected Results: Confirm that all four preferences do not work, since they are disabled:
A new window is opened instead of a new tab.
You will not be warned when closing multiple tabs.
You will not be warned when opening multiple tabs (>15 at once, e.g. the default RSS folder in the toolbar).
The tab bar is not shown when only one tab is open.
Opening a link in a new tab will open it in the background.
Testgroup(s): 3.6 Full Functional Tests (FFTs) (137)
Subgroup(s): FX 3.6 FFT - Tabbed Browsing (1242)
Recent test results for testcase ID #8410 (submission date, result ID #, testcase ID #: summary, product, branch, locale):
2010-11-02 02:16:00 336691 8410: All Tab options unchecked Firefox 3.6 Branch en-US
2010-07-07 10:07:17 279135 8410: All Tab options unchecked Firefox 3.6 Branch en-US
2010-05-12 11:20:12 257854 8410: All Tab options unchecked Firefox 3.6 Branch en-US
2010-01-17 02:21:29 232877 8410: All Tab options unchecked Firefox 3.6 Branch en-US
2010-01-11 20:56:35 230165 8410: All Tab options unchecked Firefox 3.6 Branch en-US
2010-01-07 10:50:21 228669 8410: All Tab options unchecked Firefox 3.6 Branch en-US
2009-10-22 03:22:31 213485 8410: All Tab options unchecked Firefox 3.6 Branch en-US
Litmus
Test Scenario & Expected Outcome
Recorded Test
Results
Litmus, 06/2006 - 06/2012
312,502 test executions
1,547 test cases
6,058 different testers
2,009 tested builds
10% of test cases are automated
147 releases
Testing More Intense
35,000 - 55,000 executions/release
6,000 - 9,000 executions/release
... but twice as many test executions per day
To survive under the time constraints of a rapid release we’ve had to cut the fat and focus on those test areas which are prone to failure, less on ensuring legacy support
700 - 1,100 different test cases
100 - 170 different test cases
Smaller Scope of Testing
a fixed set of tests for areas prone to regression (Flash plugin testing for example) and a dynamically changing set of tests to cover recent regressions we chemspilled for and high risk features
1,000 - 1,900 testers
8 - 34 testers
Smaller Testing Team
The weakest point [in rapid release] is that it’s harder to develop a large community which more accurately represents the scale and variability of the general population. Frequently this means that we don’t hear about issues until after release, in effect turning our release early adopters into beta testers
1) the core team has remained largely unchanged since adopting rapid release 2) the contract team has nearly doubled ... We can scale up our team much faster through contractors than through hiring. The time afforded to us to make the switch to rapid releases left little room for failure which is why we took that approach.
Test Priorities
fewer locales being tested ...
[Box plot: Locales per day, Traditional vs. Rapid]
... but more operating systems
[Box plot: OSs per day, Traditional vs. Rapid]
we now distribute across Betas. For example, we might test Windows 7, OSX 10.8, and Ubuntu in one Beta then Windows XP, Mac OSX 10.7, and Ubuntu in another Beta
In Other Words ...
Testing has to become more Focused
The greatest strength [of rapid releases] is that the scope of what needs to be tested is narrow so we can focus all of our energy on deep diving into a few areas
Release Engineering
reduced cycle time!
in-house/3rd party development -> integrate -> build -> test -> deployment
Will My Patch Make It? (And How Fast?)
MSR '13, Jiang et al.
I do hold out hope that Google does come around and works to fix their codebase to get it merged upstream to stop the huge blockage that they have now caused in a large number of embedded Linux hardware companies […] But I need the help of the Google developers to make it happen, without them, nothing can change.
Greg Kroah-Hartman
http://www.kroah.com/log/linux/android-kernel-problems.html
The Linux Integration Process
contributors 1, 2, 3 -> mailing lists (linux-usb, linux-scsi, lkml) -> subsystem maintainers 1, 2 -> maintainer Linus Torvalds -> Linux 3.5
Phases: Reviewing, Integration, Staging
Research Questions
How many patches are merged?
What kind of patch is more likely to be merged?
What kind of patch is accepted faster?
Case Study Setup
Data sources: the mailing lists (linux-usb, linux-scsi, lkml) and the commits of Linus Torvalds
Mailing lists: email1, email2, email3, ... with email patch1, email patch2, email patch3, ...
Commits: commit1, commit2, commit3, ... with commit patch1, commit patch2, commit patch3, ...
Email patches and commit patches are linked via checksums (checksum1, checksum2, checksum3, ...)
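The linking step in the case study setup (email patches on one side, commit patches on the other, checksums in between) could be sketched roughly as follows. This is only an illustration, not the study's tooling: the normalization rule (keep only added and removed lines, strip whitespace), the choice of SHA-1, and the names `patch_checksum` and `link_patches` are all assumptions made for this example.

```python
import hashlib


def patch_checksum(patch_text: str) -> str:
    """Checksum over the changed lines of a unified diff (assumed normalization)."""
    changed = []
    for line in patch_text.splitlines():
        # Skip the file headers of the diff.
        if line.startswith(("+++", "---")):
            continue
        # Keep only added/removed lines, ignoring surrounding whitespace
        # so that a patch re-wrapped by a mail client can still match.
        if line.startswith(("+", "-")):
            changed.append(line[0] + line[1:].strip())
    return hashlib.sha1("\n".join(changed).encode("utf-8")).hexdigest()


def link_patches(email_patches: dict, commit_patches: dict) -> dict:
    """Map each email id to the commit id whose patch has the same checksum."""
    by_checksum = {patch_checksum(p): cid for cid, p in commit_patches.items()}
    links = {}
    for eid, patch in email_patches.items():
        cid = by_checksum.get(patch_checksum(patch))
        if cid is not None:
            links[eid] = cid
    return links


# Hypothetical toy data, invented for illustration.
emails = {
    "email1": "--- a/usb.c\n+++ b/usb.c\n@@ -1 +1 @@\n-int x = 0;\n+int x = 1;\n",
    "email2": "--- a/scsi.c\n+++ b/scsi.c\n@@ -3 +3 @@\n+return 0;\n",
}
commits = {
    # Same change as email1, but with trailing whitespace on the added line.
    "commit1": "--- a/usb.c\n+++ b/usb.c\n@@ -1 +1 @@\n-int x = 0;\n+int x = 1;   \n",
}
links = link_patches(emails, commits)
print(links)
```

An email patch without a matching commit checksum (here `email2`) simply stays unlinked, which is how a patch that never reached the repository would show up in such a scheme.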
5 Dimensions (29 Patch Metrics)
Experience, Review, Patch, Commit
Building Decision Trees
Decision Tree (example nodes):
size: LOC > 50 ?
Number of reviewers > 3 ?
Number of review messages > 3 ?
Is this first patch in thread?
Leaves: accepted / not accepted
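To make the idea of building a decision tree over patch metrics concrete, here is a small self-contained sketch. The toy data, the three metric names and the Gini-based splitting are assumptions for illustration only; the slides do not say which learner or parameters the study used.

```python
from collections import Counter

# Hypothetical toy data: (metrics, label). All values are invented.
PATCHES = [
    ({"loc": 10, "reviewers": 4, "first_in_thread": 0}, "accepted"),
    ({"loc": 20, "reviewers": 5, "first_in_thread": 0}, "accepted"),
    ({"loc": 30, "reviewers": 4, "first_in_thread": 1}, "accepted"),
    ({"loc": 80, "reviewers": 1, "first_in_thread": 1}, "not accepted"),
    ({"loc": 120, "reviewers": 2, "first_in_thread": 1}, "not accepted"),
    ({"loc": 60, "reviewers": 1, "first_in_thread": 0}, "not accepted"),
]


def gini(rows):
    """Gini impurity of the labels in rows."""
    counts = Counter(label for _, label in rows)
    return 1.0 - sum((n / len(rows)) ** 2 for n in counts.values())


def best_split(rows):
    """Return (weighted impurity, metric, threshold) of the best binary split."""
    best = None
    for metric in rows[0][0]:
        for threshold in sorted({m[metric] for m, _ in rows}):
            left = [r for r in rows if r[0][metric] <= threshold]
            right = [r for r in rows if r[0][metric] > threshold]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if best is None or score < best[0]:
                best = (score, metric, threshold)
    return best


def build(rows, depth=0, max_depth=3):
    """Recursively build a tree; a leaf is the majority label of its rows."""
    split = best_split(rows) if gini(rows) > 0 and depth < max_depth else None
    if split is None:
        return Counter(label for _, label in rows).most_common(1)[0][0]
    _, metric, threshold = split
    left = [r for r in rows if r[0][metric] <= threshold]
    right = [r for r in rows if r[0][metric] > threshold]
    return (metric, threshold,
            build(left, depth + 1, max_depth),
            build(right, depth + 1, max_depth))


def predict(tree, metrics):
    """Walk the tree until a leaf label is reached."""
    while isinstance(tree, tuple):
        metric, threshold, left, right = tree
        tree = left if metrics[metric] <= threshold else right
    return tree


tree = build(PATCHES)
print(tree)
print(predict(tree, {"loc": 15, "reviewers": 6, "first_in_thread": 0}))
```

Each inner node is a `(metric, threshold, left, right)` tuple, which mirrors the yes/no questions on the slide (for example "size: LOC > 50 ?"); on real data the thresholds would of course come out differently.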
33% of Patches Make it ...
[Chart: accepted/rejected patches per year, 2005-2012; # of patches, and % accepted by Linus vs. % rejected by Linus. Yearly acceptance ranges from 27.03% to 33.87%, yearly rejection from 66.13% to 72.97%]
... and it takes 1~6 Months!
[Chart: percentage of accepted patches of each year, 2005-2012, by time category: instantly, within_hour, within_day, within_week, within_month, within_quarter, within_half_year, within_year, took_ages]
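The time categories in the chart legend (instantly ... took_ages) suggest a simple bucketing of the delay between submitting a patch and its acceptance. A minimal sketch, assuming the upper bounds below: only the category names come from the slides, while the concrete limits (one minute for "instantly", 30/90/182/365 days for month/quarter/half year/year) and the function name `bucket` are guesses made for this example.

```python
from datetime import datetime, timedelta

# Upper bounds are assumptions inferred from the category names.
BUCKETS = [
    (timedelta(minutes=1), "instantly"),
    (timedelta(hours=1), "within_hour"),
    (timedelta(days=1), "within_day"),
    (timedelta(weeks=1), "within_week"),
    (timedelta(days=30), "within_month"),
    (timedelta(days=90), "within_quarter"),
    (timedelta(days=182), "within_half_year"),
    (timedelta(days=365), "within_year"),
]


def bucket(submitted: datetime, accepted: datetime) -> str:
    """Return the first category whose upper bound covers the delay."""
    delay = accepted - submitted
    for limit, name in BUCKETS:
        if delay <= limit:
            return name
    return "took_ages"


# Hypothetical dates, 74 days apart.
print(bucket(datetime(2012, 1, 1), datetime(2012, 3, 15)))
```

Counting patches per bucket and per year, then dividing by the number of accepted patches of that year, would give percentages of the kind plotted in the chart.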
Reviewing Speeds Up ...
[Chart: reviewing time (#days), percentage of accepted patches of each year, 2005-2012, by time category from instantly to took_ages]
... while Integration Slows Down
[Chart: integration time (#days), percentage of accepted patches of each year, 2005-2012, by time category from instantly to took_ages]
Experience Matters Most for Patch Acceptance
precision: 73%, recall: 68.47%
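For readers less familiar with the two measures: precision is the share of patches predicted as accepted that really were accepted, and recall is the share of really accepted patches that the model found. A minimal sketch, assuming "accepted" is the positive class; the labels below are invented and are not meant to reproduce the reported 73% and 68.47%.

```python
def precision_recall(actual, predicted, positive="accepted"):
    """Precision and recall for one class, from parallel label lists."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == positive and p == positive for a, p in pairs)  # true positives
    fp = sum(a != positive and p == positive for a, p in pairs)  # false positives
    fn = sum(a == positive and p != positive for a, p in pairs)  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical labels for six patches.
actual = ["accepted", "accepted", "accepted", "rejected", "rejected", "accepted"]
predicted = ["accepted", "accepted", "rejected", "accepted", "rejected", "accepted"]
precision, recall = precision_recall(actual, predicted)
print(precision, recall)
```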
Experience, Reviewers and Patch Spread Matter for Reviewing Time
How many patches are merged? 33% patch acceptance; integration becomes slower
What kind of patch is merged more likely? by experienced developers; mature patch; specific subsystem
What kind of patch is accepted faster? less spread out; by more experienced developers
Release Engineering
Software Quality?
Rapid Release vs. Quality?
Certificate of High Quality
Trunk- vs. Branch-based Development
branch-based integration vs. trunk-based integration
"The Evolution of the Linux Build System" (Adams et al.)
Linux 0.01, 0.11, 1.0, 1.2, 2.0, 2.2, 2.4, 2.6.0, 2.6.16.18, 2.6.21.5
How to Evolve Build Systems in a Healthy Way?
How to Focus Testing?
How to Keep up with 3rd Party Dependencies?
3rd party branch
Where are the Release Engineering Tools?
RELEASE IDE
test your build!
focus your testing efforts
what did the other team break now ;-)
keep poking those upstream guys until they give in :-)
How to Align Release Engineers with the Rest of the Organization?
RELENG 2013
1st International Workshop on Release Engineering
http://releng.polymtl.ca
May 20, 2013, San Francisco, USA
84 participants
2 keynotes
6 practitioner talks
... and 10 academic talks