Technical Coordination in Open Source Software Development
James D. Herbsleb
School of Computer Science
Carnegie Mellon University
+1 412 [email protected]
http://www-2.cs.cmu.edu/~jdh/
2
Meanings of “Open Source”
Legal and pragmatic arrangements to ensure availability of source code
Development process
− Open community
− Widely dispersed contributors
− Emulated in many other contexts
− Seems relatively free of coordination problems
3
Multi-site Delay
[Bar chart: Modification Request (MR) interval, measured as last modification minus first modification, in work days; all changes July 1997 to July 1999.]
Network Element A: multi-site 12.7, single-site 4.9
Network Element B: multi-site 18.1, single-site 6.9
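The size of the multi-site delay is easier to see as a ratio. The short sketch below simply divides the four values shown in the chart; the dictionary layout and the function name `delay_ratio` are illustrative choices, not something taken from the slides.

```python
# MR interval in work days, as shown in the chart (July 1997 to July 1999).
mr_interval_days = {
    "Network Element A": {"multi_site": 12.7, "single_site": 4.9},
    "Network Element B": {"multi_site": 18.1, "single_site": 6.9},
}


def delay_ratio(intervals):
    """Return how many times longer the multi-site interval is (hypothetical helper)."""
    return intervals["multi_site"] / intervals["single_site"]


ratios = {name: round(delay_ratio(v), 2) for name, v in mr_interval_days.items()}

for name, ratio in ratios.items():
    print(f"{name}: multi-site interval is about {ratio}x the single-site interval")
```

For both network elements the multi-site interval comes out at roughly two and a half times the single-site interval.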
4
Coordination
“Managing dependencies between activities” (Malone & Crowston, 1994)
There are many kinds of dependencies; we focus on dependencies among engineering decisions, e.g.,
− Selecting or designing an algorithm
− Selecting or designing a data structure
− Selecting an object model
− Writing the termination condition for a loop
− Etc.
Coordination issues are pervasive in software engineering
5
Assumptions
Decisions are a reasonable unit of “progress” in a software project
Decision-making consumes resources (calendar time and effort)
Decisions tend to be highly mutually constraining
The “coordination problem” is avoiding constraint violation
Constraint violation produces defects
[Diagram, built up over several slides. Labels: Application Domain (Functionality), Software Design Domain (Software), Development Work, Decisions, Constraints, time, people, Coordination.]
10
“Macro” Theory of Coordination
[Causal diagram with “+” links among the following factors:]
− Number of people involved in decision
− Density of interdependence among decisions
− Automatic constraint enforcement
− Effectiveness of communication among decision-makers
− Visibility of constraints among decisions
− Coordination breakdowns: violations of mutual constraints among engineering decisions
− Defects (when violations are not discovered and fixed)
− Rework (when violations are discovered and fixed)
− Reduced productivity
− Increased cycle time
11
Coordination and Open Source
Development work done by users
− Eliminates enormous communication problem
− Constraints imposed by implicit requirements are apparent
− Simpler, smaller product
12
Coordination and Open Source
Management structure
− Minimal: core group with commit privileges
− Decision-making by those with demonstrated technical merit
− Participation mirrors dependencies
13
Coordination and Open Source
Open, archived technical discussions
− Draws on very large pool of potential experts
− Newbies can catch up with minimal distractions to existing staff
− Preserves design rationale
14
Technical Coordination Modeled as CSP
Constraint satisfaction problem
− a project is a large set of mutually constraining decisions, which are represented as
− n variables $x_1, x_2, \ldots, x_n$, whose
− values are taken from finite, discrete domains $D_1, D_2, \ldots, D_n$
− constraints $p_k(x_{k1}, x_{k2}, \ldots, x_{kj})$ are predicates defined on
− the Cartesian product $D_{k1} \times D_{k2} \times \cdots \times D_{kj}$
Solving a CSP is equivalent to finding an assignment for all variables that satisfies all constraints
Formulation of CSP taken from Yokoo and Ishida, Search Algorithms for Agents, in G. Weiss (Ed.), Multiagent Systems, Cambridge, MA: MIT Press, 1999.
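To make the formulation concrete, here is a minimal sketch of a CSP over three design decisions, solved by plain backtracking search. The decisions, their domains, and the two constraints are hypothetical examples invented for illustration, loosely echoing the kinds of engineering decisions listed earlier; backtracking is just one simple way to find a satisfying assignment and is not claimed to be the method used in the talk.

```python
# Hypothetical design decisions (variables) and their finite, discrete domains.
domains = {
    "data_structure": ["array", "linked_list"],
    "algorithm": ["binary_search", "linear_scan"],
    "loop_end": ["index_bound", "null_check"],
}

# Each constraint is (scope, predicate): the predicate is defined on the
# Cartesian product of the domains of the variables in its scope.
constraints = [
    # Assumed rule: binary search needs random access, so no linked list.
    (("data_structure", "algorithm"),
     lambda d, a: not (a == "binary_search" and d == "linked_list")),
    # Assumed rule: arrays terminate on an index bound, lists on a null check.
    (("data_structure", "loop_end"),
     lambda d, l: (d == "array") == (l == "index_bound")),
]


def violated(assignment, constraints):
    """Return the scopes of fully assigned constraints whose predicate is false."""
    broken = []
    for scope, predicate in constraints:
        if all(var in assignment for var in scope):
            if not predicate(*(assignment[var] for var in scope)):
                broken.append(scope)
    return broken


def solve(domains, constraints, assignment=None):
    """Backtracking search: extend the assignment one variable at a time."""
    assignment = dict(assignment or {})
    unassigned = [var for var in domains if var not in assignment]
    if not unassigned:
        return assignment
    var = unassigned[0]
    for value in domains[var]:
        assignment[var] = value
        if not violated(assignment, constraints):
            result = solve(domains, constraints, assignment)
            if result is not None:
                return result
        del assignment[var]
    return None


solution = solve(domains, constraints)
print(solution)
```

In this framing, a coordination breakdown corresponds to a non-empty result from `violated`: two decisions were fixed, possibly by different people, in a way that breaks a constraint between them.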
15
Modeling Coordination -- “Micro” Theory
Create models to explore influence of
− Ways of assigning decisions to agents
− Communication practices and tools
− Properties of constraint network, e.g.,
• Average distance between nodes
• Clustering
Open source process is a region in this “model space”
Empirically test hypotheses
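As a rough illustration of what such a model might measure, the sketch below treats decisions as nodes and constraints as edges, then computes the two network properties named above plus the number of constraints that span two agents. The five-node network, the two assignments, and the use of cross-agent constraints as a proxy for communication need are all assumptions made for this example, not results or definitions from the talk.

```python
from collections import deque
from itertools import combinations

# Hypothetical constraint network: nodes are decisions, edges are constraints.
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d"), ("d", "e")]


def build_graph(edges):
    """Build an undirected adjacency map from a list of edges."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


def average_distance(graph):
    """Mean shortest-path length over all connected ordered pairs of nodes."""
    total = pairs = 0
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from source
            node = queue.popleft()
            for neighbor in graph[node]:
                if neighbor not in dist:
                    dist[neighbor] = dist[node] + 1
                    queue.append(neighbor)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0


def average_clustering(graph):
    """Mean local clustering coefficient; nodes with fewer than 2 neighbors count as 0."""
    coefficients = []
    for neighbors in graph.values():
        k = len(neighbors)
        if k < 2:
            coefficients.append(0.0)
            continue
        links = sum(1 for u, v in combinations(neighbors, 2) if v in graph[u])
        coefficients.append(2 * links / (k * (k - 1)))
    return sum(coefficients) / len(coefficients)


def cross_agent_constraints(edges, owner):
    """Count constraints whose two decisions belong to different agents."""
    return sum(1 for u, v in edges if owner[u] != owner[v])


graph = build_graph(edges)
by_cluster = {"a": 1, "b": 1, "c": 1, "d": 2, "e": 2}  # assignment follows clusters
scattered = {"a": 1, "b": 2, "c": 1, "d": 2, "e": 1}   # assignment ignores clusters

print(average_distance(graph), average_clustering(graph))
print(cross_agent_constraints(edges, by_cluster),
      cross_agent_constraints(edges, scattered))
```

With this toy network, the assignment that follows the clustering leaves one constraint spanning two agents, while the scattered assignment leaves four, which is the kind of contrast such a model could be used to explore.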