Does Distributed Development Affect Software Quality? An Empirical Case Study of Windows Vista

Christian Bird, Premkumar Devanbu, Harald Gall, Brendan Murphy

In ICSE ’09: Proceedings of the 2009 IEEE 31st International Conference on Software Engineering



Distributed software development is riskier and more challenging than collocated development

Reasons for distributed software development: skill set availability, acquisitions, government restrictions, increased code size, cost and complexity

Challenges faced: delayed feedback, restricted communication, less shared project awareness, difficulty of synchronous communication, inconsistent development and build environments

Post-release failure: the inability of a system or a component to perform its required functions within specified performance requirements

Individual executables and libraries are referred to as binaries

Key Questions

- Who or what is distributed, and at what level?

- Are people or the artifacts distributed?

- Are people dispersed individually or dispersed in groups?

- In what way are the developers and other entities distributed?

- Distribution can be across geographical, organizational, temporal, or stakeholder boundaries

The study involves:

- Distributed development at multiple levels of separation

- Large-scale software development: thousands of binaries and developers

- Complexity and maintenance characteristics of the distributed and collocated binaries

- All sites involved in the study are part of the same company

Difficulties in Distributed Development

- Communication

- Coordination breakdowns

- Diversity in operating environments

- Distance reduces team cohesion

- Organizational and national cultural barriers

Testable Hypotheses

- Binaries that are developed by teams of distributed engineers will have more post-release failures than those developed by collocated engineers

- Binaries that are distributed will be less complex and have fewer dependencies

Popular Beliefs

Effects on Bug Resolution

Time to resolution of Modification Requests (MRs):

- Single site: 5 days

- Distributed sites: 12.7 days

After controlling for factors such as the number of people working on the MR, severity, and size of the change, the negative effect of distributed development was less significant

Distributed development indirectly introduces delay due to correlated factors such as team size and breadth of changes required

Feasible decisions are a must for the project

Testable hypotheses about productivity:

- People who are assigned work from many sources have lower productivity

- MRs that require work in multiple modules take more time

Effects on Quality and Productivity

Relationship between Dispersion, development productivity and conformance quality

Inference: Projects that had more dispersion had lower levels of productivity and conformance quality

Key Actions necessary for success with global development:

- Distribute entire things for entire life cycle

- Plan to accommodate time and distance

- Reduce intensive collaborations

- Reduce national and organizational cultural distance

- Reduce temporal distance

Methods and Analysis

Data Collection

Windows Vista

- 3300 binaries

- Tens of MLOC

- 59 buildings

- 21 campuses in Asia, Europe, and North America

Data collection focus: code quality, geographical location

Geographical Location based Separation

Hierarchy: Building, Cafeteria, Campus, Locality, Continent, World

Assignment to a level of the hierarchy: each binary is assigned to the lowest possible level that covers a threshold percentage of its commits

Threshold percentage: 75% of commits made
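The assignment rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the study's tooling: the `site` helper, the `assign_level` function, and all building, campus, and locality names are hypothetical, and "covers" is read here as "a single unit at that level accounts for at least 75% of the binary's commits".

```python
from collections import Counter

# Hierarchy levels from most local to most global, as listed on the slide.
LEVELS = ["building", "cafeteria", "campus", "locality", "continent", "world"]
THRESHOLD = 0.75  # share of commits a single unit must account for


def site(building, cafeteria, campus, locality, continent):
    """Hypothetical helper: describe where the author of one commit sits."""
    return {"building": building, "cafeteria": cafeteria, "campus": campus,
            "locality": locality, "continent": continent, "world": "world"}


def assign_level(commits):
    """Return (level, unit) for the lowest level where one unit holds >= 75% of commits."""
    if not commits:
        raise ValueError("a binary needs at least one commit")
    total = len(commits)
    for level in LEVELS:
        unit, count = Counter(c[level] for c in commits).most_common(1)[0]
        if count / total >= THRESHOLD:
            return level, unit
    return "world", "world"  # unreachable: every commit shares the same world


# Illustrative (made-up) sites.
b1 = site("B1", "K1", "C1", "L1", "North America")
b2 = site("B2", "K1", "C1", "L1", "North America")
b9 = site("B9", "K7", "C5", "L4", "Asia")

mostly_one_building = [b1] * 8 + [b2] * 2   # 80% of commits from B1
shared_cafeteria = [b1] * 6 + [b2] * 4      # no building reaches 75%
two_continents = [b1] * 5 + [b9] * 5        # only the world level covers 75%

print(assign_level(mostly_one_building))
print(assign_level(shared_cafeteria))
print(assign_level(two_continents))
```

Under this reading, a binary whose commits come mostly from one building counts as collocated at the building level, while one split evenly between continents is only "collocated" at the world level.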


Experiments and Results

Objective: test the hypothesis that there will be a difference in code quality between distributed and collocated binaries

Measure of quality: number of post-release failures per binary

Results cont’d

Result 1: The increase in failures with geographic distribution is small but statistically significant

Result 2: The effect of geographic separation shrinks once team size is controlled for

Linear regression analysis to examine the effect of distributed development on the number of failures:

- 9.2% increase in failures when distributed

- 4.6% increase in failures when distributed but team size is controlled
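To see why controlling for team size can shrink the apparent effect of distribution, here is a minimal sketch in plain Python. It is not the study's model or data: the `ols` helper is a hypothetical least-squares solver written for this example, and the ten binaries are synthetic, constructed so that failures depend only on team size while distributed binaries happen to have larger teams.

```python
def ols(rows, y):
    """Ordinary least squares via the normal equations.

    rows: one list of predictor values per observation (intercept is added here).
    Returns [intercept, coefficient_1, coefficient_2, ...].
    """
    X = [[1.0] + [float(v) for v in r] for r in rows]
    k = len(X[0])
    # Augmented system [X^T X | X^T y].
    A = [
        [sum(x[i] * x[j] for x in X) for j in range(k)]
        + [sum(x[i] * yi for x, yi in zip(X, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Synthetic, illustrative data: ten binaries.
team_size = [2, 3, 4, 5, 6, 8, 10, 12, 15, 20]
distributed = [0, 0, 0, 1, 0, 0, 1, 1, 1, 1]   # 1 = distributed, 0 = collocated
failures = [1 + 0.5 * t for t in team_size]    # driven by team size only

# Model 1: failures ~ distributed
naive = ols([[d] for d in distributed], failures)
# Model 2: failures ~ distributed + team_size
controlled = ols([[d, t] for d, t in zip(distributed, team_size)], failures)

print("distributed coefficient, no control:", round(naive[1], 3))
print("distributed coefficient, team size controlled:", round(controlled[1], 3))
print("team size coefficient:", round(controlled[2], 3))
```

In this constructed example the first model attributes extra failures to distribution, and the second model assigns them to team size instead. The slide's figures (9.2% falling to 4.6%) show a partial reduction of the same kind.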

Analysis of Results

Conclusion: “In the context in which Windows Vista was developed, teams that were distributed wrote code that had virtually the same number of post-release failures as those that were collocated”

Factors that may be responsible

Is there a difference between collocated and distributed binaries? A difference may come from size and complexity, code churn, test coverage, dependencies, or people

Finding: no significant difference, apart from a small (not statistically significant) difference in the team size metric.

How Is It Possible to Conduct Distributed Development Without Hampering Quality?

Relationship between sites

All sites work together, with the same pay and benefits

Cultural Barriers

Engineers visit each other and work together, which builds trust

Communication

Maintain core working hours

Consistent Use Of Tools

Same source code management, documentation, and defect tracking tools

Continued

Distributed development can work for large software projects

An organizationally compact but geographically distributed project is better than a geographically local but organizationally distributed project

Windows Vista is an example that contradicts the popular belief about distributed development

Conclusion
