
Page 1:

Experience from Studies of Software Maintenance and Evolution

Parastoo Mohagheghi

Postdoc, NTNU-IDI

SEVO Seminar, 16 March 2006

Page 2:

How do we observe evolution?

• Dynamics of software evolution:
  – Growth of the size of a system (Belady and Lehman); growth of structural attributes of software such as complexity, directory structure or dependencies.
  – Release interval (a sketch of both measures follows below).

• Categories of maintenance and evolution activities:
  – Number and percentage of modules changed; records of change for modification type and effort (corrective, enhancement); or problem reports.
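As an illustration, here is a minimal sketch of the two dynamics measures named above, size growth and release interval, computed over hypothetical per-release data (the KLOC figures echo the table on page 7; the release names and dates are made up):

# Minimal sketch: size growth and release interval per release pair.
# The release names and dates are hypothetical; KLOC values are
# illustrative (taken from the table on page 7).
from datetime import date

releases = [  # (name, release date, size in KLOC)
    ("R1", date(2001, 6, 1), 408),
    ("R2", date(2002, 3, 1), 452),
    ("R3", date(2003, 1, 1), 480),
]

for (n1, d1, s1), (n2, d2, s2) in zip(releases, releases[1:]):
    growth = s2 - s1                  # growth of system size (KLOC)
    interval_days = (d2 - d1).days    # release interval
    print(f"{n1} -> {n2}: growth {growth} KLOC, interval {interval_days} days")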

Page 3:

Data to measure the changes

• Dynamics of software evolution:
  – Data from configuration management systems (CVS, ClearCase) or other tools (a sketch follows below).
  – Automatically collected and reliable.
  – E.g., studies of open source systems.

• Categories of evolution activities:
  – Data from change logs, requirement management systems, change requests, effort reporting systems or problem reports.
  – Manually filled in and less reliable.
  – E.g., change requests at Ericsson.
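A minimal sketch of pulling such change data from a configuration management system, assuming a Git repository as a modern stand-in for the CVS/ClearCase tools named above (the release tag names are hypothetical):

# Minimal sketch: count distinct files changed between two release tags,
# using Git as a stand-in for CVS/ClearCase. Tag names are hypothetical.
import subprocess

def files_changed_between(repo_path, tag_from, tag_to):
    """Return the number of distinct files changed between two release tags."""
    out = subprocess.run(
        ["git", "-C", repo_path, "diff", "--name-only", f"{tag_from}..{tag_to}"],
        capture_output=True, text=True, check=True,
    ).stdout
    return len({line for line in out.splitlines() if line})

if __name__ == "__main__":
    releases = ["R1", "R2", "R3"]  # hypothetical release tags
    for a, b in zip(releases, releases[1:]):
        print(a, "->", b, ":", files_changed_between(".", a, b), "files changed")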

Page 4:

Other characteristics of data

• Dynamics of software evolution:
  – Data for several releases, coarse-grained (size, directory structure) or fine-grained (source files).
  – Comparable across systems.
  – Similar tools are used, such as CVS or ClearCase.

• Categories of evolution activities:
  – Fine-grained data: change requests, problem reports or change logs. Not as many releases are needed, but several data points are.
  – Difficult to compare due to the diversity of concepts and procedures, and subjective data.
  – Tools do not integrate or are not effective.

Page 5:

Empirical studies at Ericsson - Trouble Reports (TRs)

• A TR is a report of failures observed in system testing and field use. It may be related to hardware or software (code, requirements, documentation, etc.). After removing duplicates, we studied faults (the causes of problems).

• TRs were stored as plain text files. We developed programs to store data from 13 007 TRs in a SQL database.

• Inconsistencies in fields and missing data. E.g., 44% did not have the Type of problem field filled in.
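A minimal sketch in the spirit of those programs: parsing plain-text TR files into a SQLite database. The "Field: value" file layout and the field names are assumptions for illustration, not Ericsson's actual TR format.

# Minimal sketch: load plain-text trouble reports into a SQL database.
# The "Field: value" layout and field names are hypothetical.
import sqlite3
from pathlib import Path

def parse_tr(path):
    """Parse one 'Field: value' trouble-report file into a dict."""
    fields = {}
    for line in Path(path).read_text().splitlines():
        if ":" in line:
            key, _, value = line.partition(":")
            fields[key.strip().lower()] = value.strip() or None  # empty -> missing
    return fields

con = sqlite3.connect("trs.db")
con.execute("""CREATE TABLE IF NOT EXISTS tr
               (id TEXT PRIMARY KEY, severity TEXT, location TEXT, type TEXT)""")
for path in Path("tr_files").glob("*.txt"):  # hypothetical directory of TR files
    tr = parse_tr(path)
    con.execute("INSERT OR IGNORE INTO tr VALUES (?, ?, ?, ?)",  # IGNORE drops duplicates
                (tr.get("id"), tr.get("severity"), tr.get("location"), tr.get("type")))
con.commit()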

Page 6:

Empirical studies at Ericsson - Change Requests (CRs)

• Two kinds of changes in release-based development: changes between releases and changes during development of a release.

• CRs handled changes initiated during a release to add, modify or remove a requirement, or to improve an implementation.

• 169 CRs in Word files were analyzed. The data on estimated effort was unusable, and the impact on components was given too coarsely (at the subsystem level).

Page 7:

What did we observe?

Release | No. of TRs | No. of TRs, subsystem | No. of TRs, block | No. of TRs, OP6 | No. of CRs | Size in KLOC | Modified KLOC | Person-Hours Test | %Modification | %Reuse
1       | -          | 9                     | -                 | 217             | 10         | 408          | 206           | 11 000            | 50            | 61
2       | 602        | 414                   | 109               | 226             | 37         | 452          | 230           | 30 000            | 51            | 59
3       | 1953       | 1519                  | 1063              | 414             | 118        | 480          | 240           | 55 000            | 50            | 61

Page 8:

What we could not measure

• Changes to requirements between releases, needed for a holistic view of evolution. ReqPro was used for requirements, but its data was not consistent.

• Effort to implement changes or correct defects. The reports contained estimated effort, but the data was not reliable.

• Impact of requirements, change requests or trouble reports on software components. There was no traceability to code; we depended on descriptions in files, which were not reliable.

Page 9:

Examples of missing data

Percentages of missing data:

System Id | Severity | Location                                                          | Type
S1        | 0        | 0                                                                 | 0
S2        | 4.4      | 25.1                                                              | 2.5
S3        | 20.0     | 20.0                                                              | 8.6* (4.3)
S4        | 0        | 0                                                                 | 9.0* (8.4)
S5        | 0        | 22 for large subsystems, 46 for smaller blocks inside subsystems | 44**

** For the 12 releases in the dataset.

Severity or Priority: 90% in S1, 57% in S2, 72% in S3, 57% in S4 and 57% in S5 had medium severity.
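A minimal sketch of how such missing-data percentages can be computed, assuming TR records in the hypothetical SQLite database sketched earlier (column names are illustrative):

# Minimal sketch: percentage of missing values per field, as tabulated
# above. Assumes the hypothetical "trs.db" database from the earlier
# sketch; column names are illustrative.
import sqlite3
import pandas as pd

con = sqlite3.connect("trs.db")
trs = pd.read_sql_query("SELECT severity, location, type FROM tr", con)
missing_pct = trs.isna().mean() * 100   # fraction of NULLs per column, as %
print(missing_pct.round(1))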

Page 10:

Comparing to other studies, or generalization

• We found that 40% of change requests were improvements of a quality attribute and 21% were changes to functionality (perfective), mostly initiated by developers. But these definitions vary across studies.

• Defects or problems in other studies may be related only to software, may be detected in different phases, or each correction to a module may be counted as a fault (as in Mira's paper from ICSM 1999).

Page 11:

What we have learned on metrics definition

• Concepts: be careful to define yours relative to others'. Paper on problem reports for the ICSE'06 WoSQ workshop: What, Where and When (a sketch of the classification follows below).
  – What: the problem's appearance, its cause, or the human encounter with the system.
  – Where: software (executable or non-executable) or the system.
  – When: detected in unit testing, system testing, field use, or all.
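A minimal sketch of encoding this three-dimensional classification as explicit types, so every problem report is classified on What, Where and When. The member names paraphrase the bullets above; they are an illustration, not the paper's exact taxonomy.

# Minimal sketch: the What/Where/When classification as explicit types.
# Enum members paraphrase the bullets above and are illustrative only.
from dataclasses import dataclass
from enum import Enum

class What(Enum):
    APPEARANCE = "problem appearance"
    CAUSE = "cause (fault)"
    ENCOUNTER = "human encounter with the system"

class Where(Enum):
    EXECUTABLE_SOFTWARE = "executable software"
    NON_EXECUTABLE_SOFTWARE = "non-executable software"
    SYSTEM = "system"

class When(Enum):
    UNIT_TEST = "unit testing"
    SYSTEM_TEST = "system testing"
    FIELD_USE = "field use"
    ALL = "all phases"

@dataclass
class ProblemReport:
    report_id: str
    what: What
    where: Where
    when: When

# A report must state all three dimensions to be constructed at all:
pr = ProblemReport("TR-0001", What.CAUSE, Where.EXECUTABLE_SOFTWARE, When.SYSTEM_TEST)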

Page 12:

What we have learned on data collection

• Improve or develop tools and routines:
  – Don’t close a change request or trouble report without sufficient data.
  – Store data in a SQL database or a tool with search facilities (see the sketch after this list).
  – Avoid inconsistencies between releases, or be aware of them.
  – Study how to integrate data from several sources.
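A minimal sketch of the first two routines: a SQLite schema whose constraints refuse to close a trouble report without sufficient data (table and field names are illustrative):

# Minimal sketch: a SQLite schema that enforces "don't close a report
# without sufficient data". Table and field names are illustrative.
import sqlite3

con = sqlite3.connect("tracking.db")
con.execute("""
CREATE TABLE IF NOT EXISTS trouble_report (
    id         TEXT PRIMARY KEY,
    severity   TEXT NOT NULL,
    location   TEXT NOT NULL,
    type       TEXT NOT NULL,
    status     TEXT NOT NULL DEFAULT 'open'
               CHECK (status IN ('open', 'closed')),
    resolution TEXT,
    -- a report may only be closed if a resolution is recorded
    CHECK (status != 'closed' OR resolution IS NOT NULL)
)""")
con.commit()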

Page 13:

What we have learned on analysis

• Report the mean, median, standard deviation, confidence intervals, and p-values if inferential statistics are applied (a sketch follows below). All methods for accumulating evidence (for example, meta-analysis or combining p-values) need properly reported statistics.

• Report inconsistencies, missing data and other threats to validity. A study of papers on reuse shows that only 4 of 12 discussed validity at all.
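A minimal sketch of reporting these statistics in Python with SciPy; the fault-density sample is made up for illustration:

# Minimal sketch: descriptive and inferential statistics recommended
# above, computed for a made-up sample of fault densities.
import numpy as np
from scipy import stats

faults_per_kloc = np.array([1.2, 0.8, 2.1, 1.5, 0.9, 1.7, 1.1, 2.4])  # made-up data

print("mean:  ", np.mean(faults_per_kloc))
print("median:", np.median(faults_per_kloc))
print("StD:   ", np.std(faults_per_kloc, ddof=1))  # sample standard deviation

# 95% confidence interval for the mean, using the t distribution
ci = stats.t.interval(0.95, df=len(faults_per_kloc) - 1,
                      loc=np.mean(faults_per_kloc),
                      scale=stats.sem(faults_per_kloc))
print("95% CI:", ci)

# p-value from a one-sample t-test against a hypothetical baseline of 1.0
t_stat, p_value = stats.ttest_1samp(faults_per_kloc, popmean=1.0)
print("p-value:", p_value)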

Page 14:

Conclusions

• Different types of data and concerns for 1) studying the dynamics of software evolution and 2) studying categories of software evolution activities.

• Why do we need both? From RELEASE:
  – Retrospective study: can we reconstruct how and why the software evolved to where it is now? Needs 2).
  – Predictive study: can we predict how a system will evolve given its current state? Needs 1) and 2), and the evolution laws.
  – Curative activity: can a technique support a given evolutionary process? Needs 1) and 2).

• Extend this to a paper.