

  • Nirikshan: Mining Bug Report History for Discovering Process Maps, Inefficiencies and Inconsistencies

    Monika Gupta, Ashish Sureka

    Indraprastha Institute of Information Technology, Delhi (IIITD), New Delhi, India

    {monikag, ashish}@iiitd.ac.in

    ABSTRACT

    Issue tracking systems such as Bugzilla, Mantis and JIRA are Process Aware Information Systems that support the business process of issue (defect and feature enhancement) reporting and resolution. The process from issue reporting to resolution consists of several steps or activities performed by various roles (bug reporter, bug triager, bug fixer, developers, and quality assurance manager) within the software maintenance team. Project teams define a workflow or a business process (design time process model and guidelines) to streamline and structure the issue management activities. However, the runtime process (reality) may not conform to the design time model and can have imperfections or inefficiencies. We apply business process mining tools and techniques to analyze the event log data (bug report history) generated by an issue tracking system with the objective of discovering runtime process maps, inefficiencies and inconsistencies. We conduct a case study on data extracted from the Bugzilla issue tracking system of the popular open-source Firefox browser project. We present and implement a process mining framework, Nirikshan, consisting of several steps: data extraction, data transformation, process discovery, performance analysis and conformance checking. We conduct a series of process mining experiments to study self-loops, back-and-forth transitions, issue reopens, unique traces, event frequency, activity frequency and bottlenecks, and we present an algorithm and metrics to compute the degree of conformance between the design time and the runtime process.

    Categories and Subject Descriptors

    H.4 [Information Systems Applications]: Miscellaneous; D.2.8 [Software Engineering]: Metrics: complexity measures, performance measures

    General Terms

    Algorithms, Experimentation, Measurement

    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
    ISEC 2014 Chennai, Tamil Nadu India
    Copyright 20XX ACM X-XXXXX-XX-X/XX/XX ...$15.00.

    Keywords

    Mining Software Repositories, Process Mining, Software Maintenance, Issue Tracking System, Open-Source Software, Empirical Software Engineering and Measurements

    1. RESEARCH MOTIVATION AND AIM

    Issue Tracking Systems (ITS) such as Bugzilla^1, Mantis^2 and JIRA^3 are Process Aware Information Systems (PAIS) used during software development and maintenance for issue (feature enhancement requests and defects) tracking and management. Large and complex software projects, both open-source and commercial, define a bug lifecycle with different states and transitions for a bug report, and a process model (activities, events, actors, transitions and workflow) that serves as a guideline for the project team. An ITS such as Bugzilla logs the events generated during a bug lifecycle (such as who reported the bug and when, changes to the resolution and status fields of a bug, and developer and component assignment or reassignment) and refers to this event log as the Bug History. Process Mining (the intersection of Business Process Management and Data Mining) consists of mining event logs generated from business process execution supported by a PAIS [10]. The focus of the research presented in this paper is the application of process mining techniques to bug history data and event logs generated by an ITS (a software repository). Issue tracking and management is integral to software development and maintenance and is fraught with several challenges. The work presented in this paper is motivated by the need to mine process data generated by an ITS to extract insights, particularly inefficiencies and inconsistencies. Such insights can help practitioners solve problems, resulting in productivity and efficiency improvements in software maintenance. The specific research aims of the work presented in this paper are the following:

    1. To develop a generic algorithm for quantitatively measuring the compliance (conformance checking and verification) between the design time and the runtime (reality) process model using metrics in the context of issue tracking and management.

    2. To investigate the application of process mining platforms such as Disco^4 and state-of-the-art algorithms for discovering process maps from event logs (in our case, bug report history data) and then analyzing the discovered process map to understand the actual runtime process.

    ^1 https://bugzilla.mozilla.org/
    ^2 http://www.mantisbt.org/
    ^3 https://www.atlassian.com/software/jira
    ^4 http://fluxicon.com/disco/

    3. To define a process mining framework and conduct a case study on popular open-source projects such as Firefox and Core from the Mozilla Foundation, analyzing the performance of various process-related activities and identifying inefficiencies (statistics on events, activities and transitions) and imperfections (reopening of bugs, bottlenecks, self-loops, back-and-forth between activities).

    2. RELATED WORK AND RESEARCH CONTRIBUTIONS

    In this section we discuss closely related work and present the novel research contributions of this paper in the context of existing work. We conduct a literature review in the area of process mining of software repositories. Table 1 lists 7 closely related works in chronological order (2006 to 2012). We synthesize existing work and classify the papers by attributes such as software repository (Version Control Systems, Defect Tracking Systems, Mail Archives), project (ArgoUML, Linux, Proprietary), OSS or CSS (open-source or closed-source software) and research objective. The work most closely related to ours is by Sunindyo et al. [9], who proposed a framework for effective engineering process observation in OSS projects. They used a hypothesis testing approach to verify the design model against the runtime event log from bug history data, and obtained a process map for RHEL bug history data using the heuristic mining algorithm of the process mining tool ProM. Their conformance verification is not in-depth and covers only a few dimensions by proving or disproving hypotheses. Poncin et al. [5] combined information from different repositories by matching related events, using a prototype called FRASR (FRamework for Analyzing Software Repositories). The resulting log is analyzed using the ProM dotted chart visualization for role classification and the fuzzy miner plugin for fuzzy graph generation (the actual view of the bug lifecycle). However, their major emphasis is on the organizational perspective and not on analyzing the runtime process.

    From the literature, we observe that the major research focus is on designing algorithms for process discovery, defining possible broad perspectives of event log analysis (such as process, case and organizational) and studying their applicability. However, to the best of our knowledge, the outcome of the process perspective (the discovered process map) for an ITS event log, which is a very important repository of a software project, has not been analyzed from multiple dimensions. We hypothesize that an in-depth analysis is needed to help process analysts make informed decisions for process improvement and, in effect, for the overall stability of an organization.

    In the context of previous work, the study presented in this paper addresses these challenges by making the following novel and unique research contributions:

    1. To the best of our knowledge, the work presented in this paper is the first in-depth study on process mining of event logs (bug lifecycle and history data) from an issue tracking system (a process aware information system) for discovering runtime process maps (and statistics on activities, events, transitions, unique traces), inefficiencies (issue reopens, bottlenecks, self-loops, back-and-forth between activities) and inconsistencies (an algorithm to compute the degree of conformance between design time and runtime process models).

    2. A case study and empirical analysis (implementing the process mining framework and conformance checking algorithm) on issue tracking system data of the Firefox and Core products (large, complex, long-lived and popular open-source projects) from the Mozilla Foundation. We present new results and fresh perspectives (by mining large real-world data) motivating the need for, and demonstrating the usefulness of, process mining of event log data or traces generated from issue tracking systems.

    3. RESEARCH METHODOLOGY AND FRAMEWORK

    We conduct experiments on issue tracking system data from the Firefox and Core open-source projects. Table 2 displays the details of the experimental dataset. We chose Firefox and Core for our analysis because these projects are large, complex and long-lived. Firefox is Mozilla's flagship software product and the third most-used web browser for Windows, Mac and Linux. Core includes shared components used by Firefox and other Mozilla software. The bug report data for the Firefox and Core projects are publicly available, so our results can be replicated and used for comparison by other researchers. We conduct experiments on two projects (Firefox and Core) to remove one-project bias from our results and conclusions (increasing the generalizability of our results). As shown in Table 2, the experimental dataset contains 12234 issue reports from Firefox and 24253 reports from Core (one-year period: 1 January 2012 to 31 December 2012).

    Attribute                                         | Value
    First issue creation date                         | 1 Jan 2012
    Last issue creation date                          | 31 Dec 2012
    Date of extraction                                | 14 July 2013
    Total issues in 2012                              | 111234
    Issues not authorized for access                  | 15638
    Issues without history                            | 3149
    Total issues extracted for Firefox                | 12234
    Total issues extracted for Core                   | 24253
    Total activities for Firefox (including Reported) | 40233
    Total activities for Core (including Reported)    | 88396

    Table 2: Experimental dataset details for Core and Firefox (open-source Mozilla projects).

    The complete framework, called Nirikshan (a Hindi word meaning inspection or review) and depicted in Figure 1, has three major stages: 1. data extraction, 2. data transformation, and 3. process mining.

    1. Data extraction: We extract the bug report history (Figure 2 shows a snapshot of a Bugzilla bug history page) using the Bugzilla APIs (through the XML-RPC or JSON-RPC interface). The bug report history serves as the process event log generated by the ITS. We extract Status, Resolution (only for closed bugs), Assignee, QA Contact and Component from the bug history to derive the process map. In our experiments, we record the creation timestamp for the activity Reported.

    Author(s) | Year | Software Repository | Project | OSS/CSS | Objective
    Kindler et al. [3] | 2006 | SCM | Dummy versioning log to make it independent of SCM system | OSS | Proposed an incremental workflow mining algorithm to semi-automatically derive a process model from versioning information.
    Weijters et al. [10] | 2006 | Dummy event log with random traces | NA | NA | Distinguished three mining perspectives and focussed on the process perspective. Described and analysed the HeuristicsMiner algorithm.
    Rubin et al. [7] | 2007 | SCM, VCS | ArgoUML | OSS | Linear Temporal Logic (LTL) Checking (Conformance Analysis), Social Network Discovery, Performance Analysis, Petri Net Discovery.
    Akman et al. [1] | 2009 | SCM | Industry project (for Turkish Armed Forces) | CSS | Analyzed and compared the effectiveness of four process discovery algorithms on software process. Analyzed discrepancies between real time and design time process.
    Knab et al. [4] | 2010 | Issue Tracking System | Industrial project (EUREKA/ITEA project SERIOUS) | CSS | Interactive approach to visualise effort estimation and process lifecycle patterns in ITS to detect outliers, flaws and interesting properties.
    Poncin et al. [5] | 2011 | Bug repositories, mail archives, SVN | aMSN, GCC | OSS | Combined different repositories for analysis using a prototype, FRASR. Role classification and bug lifecycle construction using ProM.
    Sunindyo et al. [9] | 2012 | Defect Tracking System | Red Hat Enterprise Linux (RHEL) | OSS | Framework for collecting and analyzing data from a bug reporting system, conformance checking to improve process quality.

    Table 1: Literature survey of papers at the intersection of mining software repositories and process mining, arranged in chronological order.

    For open bugs, the status can be New, Unconfirmed, Assigned or Reopened, and each is captured as an activity. For closed bugs, the Verified status is captured as is, while for the status Resolved the resolution is captured as the activity. The resolution can be Fixed, Invalid, Wontfix, Duplicate, Worksforme or Incomplete^5. We treat every closed resolution as a distinct event for detailed analysis, since this also captures the transitions between different resolutions. Changes to Assignee, QA Contact and Component are recorded as Dev-reassign, QA-reassign and Comp-reassign respectively, as they are important events in a bug lifecycle and can lead to a change in state; for example, a bug should be marked New after developer reassignment^6. Given the significance of the captured activities, we hypothesize that they characterize the control flow of a bug well (the tasks performed during the progression of a bug), covering the tasks performed from the inception to the closing of a bug. We obtain the timestamp of an activity from the when field of the bug history, as can be seen in Figure 2, to order the activities in the sequence of their actual execution (while generating the process map via Disco). The performer of the activity is treated as the resource and is captured from the who field of the bug history.

    2. Data transformation: One of the major challenges in software process mining is to produce a log conforming to the input format of the process mining tool [5]. Data preprocessing is therefore crucial before analysis in a process mining tool. We perform the following processing to address these challenges:

        - Activities with timestamp, bug ID and associated resource make up the event log for process map generation.

        - The Reported event is not stored in the bug history, so it is added to the event log with the creation timestamp.

        - The initial state is not captured in the history if no activity leading to a change in status or resolution is performed. For such cases only Reported (with the creation timestamp) is stored in the event log and not the initial state. Otherwise, the initial state is also stored with the same timestamp as Reported. For instance, the process map derived from the history of a bug is shown in Figure 2.

    ^5 https://bugzilla.mozilla.org/page.cgi?id=fields.html#status
    ^6 http://www.bugzilla.org/docs/2.18/html/lifecycle.html

        - There are cases in which Comp-reassign, Dev-reassign or QA-reassign has the same timestamp as another activity. This conflict is resolved by subtracting a delta from the timestamps of these activities such that the final ordering in all conflicts (except Reopen) is:
          Comp-reassign → QA-reassign → Dev-reassign → Other activity
          This is because component assignment triggers developer and QA assignment^7, followed by the activities for bug progression. However, Reopen is the reactivation of a bug rather than the filing of a new bug, so that conflict is resolved as:
          Reopen → Comp-reassign → QA-reassign → Dev-reassign.

    [Figure 1 labels: bugzilla.mozilla.org; Data Extraction; Raw Data (MySQL); Data Transformation; Event Log (Case ID, Timestamp, Activity, Resource); Process Mining (Discovery, Conformance Verification, Performance Analysis); Process Model]

    Figure 1: Nirikshan: Proposed research framework with three major phases: 1. data extraction, 2. data transformation, and 3. process mining.

    [Figure 2 labels: Snapshot of Mozilla bug history (bug ID 600028); Event Log (Activity, Resource, Timestamp); Process Map]

    Figure 2: A snapshot of Mozilla bug report history (event log) and corresponding process map.

    3. Process mining: The event log can be mined from various perspectives [10]. We perform process mining to discover the runtime process model (process perspective), analyze performance, and verify the conformance of the as-is model with the design time process model. We analyze statistics for activities, transitions, events and unique traces from the discovered process map. The performance analysis covers reopening of bugs, presence of loops, back-and-forth transitions and identification of bottlenecks.
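    The extraction and transformation rules of stages 1 and 2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the tooling used in the study: the flattened record layout (dicts with when, who, field and added keys), the field names, the function names and the one-millisecond delta are all hypothetical. The initial-state rule is left out for brevity.

```python
from datetime import datetime, timedelta

# Assumed names of the history fields that produce reassignment activities.
REASSIGN = {"component": "Comp-reassign", "qa_contact": "QA-reassign",
            "assigned_to": "Dev-reassign"}
# Status values captured as activities; RESOLVED is replaced by its resolution.
STATUSES = {"UNCONFIRMED", "NEW", "ASSIGNED", "REOPENED", "VERIFIED"}
# Rank used to break timestamp ties: a lower rank is ordered earlier.
TIE_RANK = {"Reopened": 0, "Comp-reassign": 1, "QA-reassign": 2, "Dev-reassign": 3}
OTHER_RANK = 4
DELTA = timedelta(milliseconds=1)  # assumed size of the delta


def to_activity(change):
    """Map one history change to an activity name, or None if it is ignored."""
    field, added = change["field"], change["added"]
    if field in REASSIGN:
        return REASSIGN[field]
    if field == "status" and added in STATUSES:
        return added.capitalize()   # New, Unconfirmed, Assigned, Reopened, Verified
    if field == "resolution" and added:
        return added.capitalize()   # Fixed, Invalid, Wontfix, Duplicate, ...
    return None


def build_event_log(bug_id, reporter, created, history):
    """Return (case_id, timestamp, activity, resource) rows ordered by time."""
    # Reported is not part of the history, so it is added with the creation time.
    rows = [(bug_id, created, "Reported", reporter)]
    for change in history:
        activity = to_activity(change)
        if activity is not None:
            rows.append((bug_id, change["when"], activity, change["who"]))

    # Group rows by timestamp so that only genuine conflicts are shifted.
    by_time = {}
    for row in rows:
        by_time.setdefault(row[1], []).append(row)

    log = []
    for when, group in by_time.items():
        for case, _, activity, who in group:
            rank = TIE_RANK.get(activity, OTHER_RANK)
            shift = (OTHER_RANK - rank) * DELTA if len(group) > 1 else timedelta(0)
            log.append((case, when - shift, activity, who))
    return sorted(log, key=lambda row: row[1])
```

    Subtracting a rank-dependent multiple of the delta, rather than sorting on a secondary key, keeps the ordering visible to any tool that orders events by timestamp alone.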

    4. PROCESS DISCOVERY

    There are many process mining tools, such as ProM (open source) and Disco^8 (commercial), that can be used to obtain a process model. We import the preprocessed data into Disco to obtain the process map and other statistical information for Core and Firefox. The Disco miner is based on the proven framework of the Fuzzy Miner with a completely new set of process metrics and modeling strategies^9. It is used to discover the runtime process from the actual event log generated during the progression of a bug. Bug ID is selected as the case (for process map generation in Disco) to associate all activities pertaining to the same bug ID, so that we can visualize the lifecycle of a bug. The process map has 15 nodes for both projects, each corresponding to an activity, and a directed edge represents a transition between two activities (nodes). Because Reported is always the source and can have no incoming edge, 15 nodes allow a maximum of 15 × 14 = 210 unique transitions, self-loops included. The process map obtained is fairly complex at the highest resolution (including infrequent transitions), with 160 and 156 unique transitions for Core and Firefox respectively. For clarity, the process map at 20% resolution (core transitions) for Firefox is shown in Figure 3. The label of each edge indicates the absolute frequency of the transition. The shade and thickness of an edge correspond to its frequency: more frequent transitions are darker and less frequent ones lighter.

    ^7 http://www.bugzilla.org/docs/2.20/html/components.html
    ^8 Disco is a proprietary tool for which we obtained an academic license and which we used for our analysis.
    ^9 http://fluxicon.com/disco/files/Disco-Tour.pdf
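    The edge labels of such a process map are directly-follows counts, which can be illustrated with a short sketch. It does not reproduce Disco's Fuzzy-Miner-based filtering; it only counts how often one activity is immediately followed by another and checks the 210-transition bound. The function names and the toy traces are assumptions for illustration.

```python
from collections import Counter

# The 15 activities used as process map nodes.
ACTIVITIES = ["Reported", "Unconfirmed", "New", "Assigned", "Reopened",
              "Fixed", "Invalid", "Wontfix", "Duplicate", "Worksforme",
              "Incomplete", "Verified", "Dev-reassign", "QA-reassign",
              "Comp-reassign"]


def max_transitions(n_activities):
    """Upper bound on distinct edges when the start activity has no incoming edge."""
    # Every activity may lead to any activity except the start one (self-loops allowed).
    return n_activities * (n_activities - 1)


def transition_counts(traces):
    """Count how often activity a is directly followed by activity b."""
    counts = Counter()
    for trace in traces:
        counts.update(zip(trace, trace[1:]))
    return counts


# Toy traces, one list of activities per bug, already ordered by timestamp.
traces = [
    ["Reported", "New", "Dev-reassign", "Fixed"],
    ["Reported", "New", "Fixed"],
    ["Reported", "Unconfirmed", "Duplicate"],
]
counts = transition_counts(traces)
print(max_transitions(len(ACTIVITIES)))   # 210
print(counts[("Reported", "New")])        # 2
```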

    4.1 Activity Frequency Analysis

    Table 3 shows the relative occurrence of all activities together with their absolute frequency. After Reported (which exists for all cases), the most frequent event is Unconfirmed for Firefox and New for Core. For Core, this reduces the effort spent on confirming reported bugs. Figure 4 shows the distribution of activities over cases, that is, the percentage of cases that have a given activity in their event log.

    As evident from Figure 4, a good percentage of cases have Dev-reassign and Comp-reassign events in their lifecycle, and the relative frequency of these events is also high, which is undesirable. QA reassignment also happens quite often. Developer reassignment is much higher for Core than for Firefox. Reassignment wastes developers' time and effort and delays bug fixing, so efforts should be made to minimize it. A bug once resolved should be verified. However, only around 4% of bugs are verified by the QA manager in both projects, far fewer than the number of resolved bugs, which signals inefficiency. This highlights the need to uncover and address the reasons for infrequent verification.

    Figure 3: Process map for Firefox at 20% resolution with labels as absolute frequency and shade proportional to frequency.

    Further, Figure 4 shows that a bug in Core is almost twice as likely to be Fixed, which is the recommended situation. For Firefox, a comparatively large number of bugs are marked Duplicate, Invalid or Worksforme, which means that much of the practitioners' time goes into addressing issues that contribute less to overall product quality improvement. To avoid duplicates, an efficient search mechanism can be used before reporting a bug, or an extension can warn about a duplicate bug at the time of bug submission. Similarly, clear guidance on what should be reported as a bug would reduce invalid bugs, and emphasizing the need for a clear description of bugs would improve understanding.
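    The two measures used in this subsection, relative frequency over all events (Table 3) and the percentage of cases containing an activity (Figure 4), can be computed as in the following sketch. The function name and the (case_id, activity) input layout are assumptions made for illustration.

```python
from collections import Counter


def activity_stats(event_log):
    """Summarise activities in an event log of (case_id, activity) pairs.

    Returns {activity: (absolute_count, relative_freq_pct, case_pct)}.
    """
    events = list(event_log)
    total_events = len(events)
    cases = {case for case, _ in events}
    abs_freq = Counter(activity for _, activity in events)
    # A case counts once per activity, however often the activity repeats in it.
    case_freq = Counter(activity for _, activity in set(events))
    return {
        activity: (count,
                   round(100.0 * count / total_events, 2),
                   round(100.0 * case_freq[activity] / len(cases), 2))
        for activity, count in abs_freq.items()
    }


# Toy log: bug 1 is marked New twice, bug 2 has only been reported.
stats = activity_stats([(1, "Reported"), (1, "New"), (1, "New"), (2, "Reported")])
print(stats["New"])       # (2, 50.0, 50.0)
print(stats["Reported"])  # (2, 50.0, 100.0)
```

    As a check against Table 3, the Reported row for Firefox follows from 12234 events out of 40233, that is round(100 * 12234 / 40233, 2) = 30.41.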

    4.2 Transition Frequency Analysis

    Intuitively, frequent transitions should be part of the design process. We find that frequent transitions are usually part of the lifecycle defined for Bugzilla. However, some transitions are infrequent even though they are allowed in the design time process, namely Verified → Reopened with only 14 and 4 cases and Reopened → Assigned with 20 and 8 cases for Core and Firefox respectively. Sometimes this is because the source or destination activity is itself infrequent. Another reason can be that the majority of cases move from a given state to one specific state, so the frequency of the other possible transitions becomes low, which gives the impression that they are not part of the design process. The infrequent occurrence of the Assigned → New transition implies non-adherence to the defined process: a bug should be marked New after Dev-reassign, but this is not always practiced, as the frequency of the transition (25 for Firefox and 30 for Core) is much lower than the frequency of developer reassignment for both projects. Also, the Verified → Unconfirmed transition, although defined in the design process, never occurred at runtime, so the resources for maintaining this defined transition are never used.

    [Figure 4: x-axis: Percentage of cases with activity in event log (0 to 100); y-axis: Activity (Wontfix, Incomplete, Verified, Worksforme, Invalid, Duplicate, Fixed, Reopened, QA-reassign, Dev-reassign, Assigned, Comp-reassign, New, Unconfirmed, Reported); series: Core, Firefox]

    Figure 4: Activity distribution over cases showing the percentage of cases with a given activity in the event log.

    Activity      | Firefox Rel freq. (Abs) | Core Rel freq. (Abs)
    Reported      | 30.41 (12234)           | 27.44 (24253)
    New           | 11.42 (4596)            | 17.27 (15267)
    Fixed         | 6.93 (2788)             | 14.22 (12573)
    Dev-reassign  | 6.65 (2677)             | 12.98 (11471)
    Comp-reassign | 7.16 (2880)             | 6.61 (5843)
    Assigned      | 4.48 (1802)             | 5.79 (5121)
    QA-reassign   | 3.23 (1300)             | 3.50 (3095)
    Unconfirmed   | 12.16 (4894)            | 3.27 (2893)
    Duplicate     | 5.85 (2354)             | 2.62 (2319)
    Worksforme    | 2.93 (1180)             | 1.81 (1599)
    Verified      | 1.27 (512)              | 1.22 (1081)
    Invalid       | 4.21 (1692)             | 1.22 (1075)
    Reopened      | 0.82 (330)              | 1.18 (1045)
    Wontfix       | 1.23 (494)              | 0.66 (587)
    Incomplete    | 1.24 (500)              | 0.20 (174)
    Total         | 40,233                  | 88,396

    Table 3: Relative frequency percentage (Rel freq.) and absolute frequency (Abs) for all 15 activities.

    Undefined transitions are mostly infrequent, with a few exceptions. Such transitions should be analyzed: if they are found to have a good influence on overall efficiency they can be added to the design process; otherwise steps should be taken to avoid them in future. For instance, Reported → Assigned occurs in 1659 and 325 cases for Core and Firefox respectively, even though it is not defined in the design process. We analyze these cases to determine whether the chances of developer reassignment are high or low when the bug is marked Assigned directly at the time of reporting. For Core and Firefox respectively, 4.88% and 14.77% of the cases assigned at the time of reporting have a developer reassignment, which is less than the 47.87% and 18.38% of cases with an indirect transition. This suggests that the reporter had a clear idea of who the right person to fix the bug was and assigned it directly, resulting in a comparatively lower chance of reassignment.
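    A comparison of this kind can be sketched as follows. The paper does not spell out how direct and indirect cases were separated, so the split used here is an assumption: a case is direct if its trace starts with Reported → Assigned, and indirect if it reaches Assigned later. The function name and the toy traces are likewise hypothetical.

```python
def reassignment_rates(traces):
    """Return (direct_pct, indirect_pct): share of cases containing Dev-reassign.

    Direct cases start with Reported -> Assigned; indirect cases reach
    Assigned later in the trace. Cases that never reach Assigned are skipped.
    """
    direct, indirect = [], []
    for trace in traces:
        if "Assigned" not in trace:
            continue
        has_reassign = "Dev-reassign" in trace
        if list(trace[:2]) == ["Reported", "Assigned"]:
            direct.append(has_reassign)
        else:
            indirect.append(has_reassign)

    def pct(flags):
        # Percentage of True flags; 0.0 when the group is empty.
        return 100.0 * sum(flags) / len(flags) if flags else 0.0

    return pct(direct), pct(indirect)


# Toy traces, one list of activities per bug.
traces = [
    ["Reported", "Assigned", "Fixed"],
    ["Reported", "Assigned", "Dev-reassign", "Fixed"],
    ["Reported", "New", "Dev-reassign", "Assigned", "Fixed"],
    ["Reported", "New", "Fixed"],
]
print(reassignment_rates(traces))  # (50.0, 100.0)
```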

    4.3 Event Analysis

    A sequence of events is performed from the time a bug is reported until it is closed. The number of events varies from case to case. We use only closed cases for our analysis. Bugs that are resolved and not reopened thereafter are taken as closed. The sequence of events up to the resolution of a bug is taken into consideration and the events thereafter are ignored. There are in total 16989 and 8259 closed cases with 71615 and 31797 events for Core and Firefox respectively. As Figure 5 indicates, the largest number of cases have 3 events in their lifecycle for both projects. A lifecycle with 3 events can be represented as Reported → New/Unconfirmed → Resolved, which is the shortest possible sequence of events for resolution. Thus, of all the closed cases, the majority executed the minimum number of events needed for resolution in both projects. Very few bugs (cases) have more than 10 events in their lifecycle, which shows stability. The maximum number of events for a case is 22 for Core and 21 for Firefox; these cases are outliers.
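    The selection of closed cases and the events-per-case distribution can be sketched as below. One detail is an assumption: when a trace contains several resolutions, the sketch truncates it at the last one, which the text does not specify. The function names and the toy traces are illustrative only.

```python
from collections import Counter

RESOLUTIONS = {"Fixed", "Invalid", "Wontfix", "Duplicate", "Worksforme", "Incomplete"}


def closed_prefix(trace):
    """Return the trace up to its last resolution, or None if the bug is not closed."""
    last = max((i for i, a in enumerate(trace) if a in RESOLUTIONS), default=None)
    # Not closed if never resolved, or if reopened after the last resolution.
    if last is None or "Reopened" in trace[last + 1:]:
        return None
    return trace[:last + 1]


def events_per_case(traces):
    """Percentage of closed cases by number of events up to resolution."""
    lengths = [len(p) for p in map(closed_prefix, traces) if p is not None]
    hist = Counter(lengths)
    return {n: round(100.0 * c / len(lengths), 2) for n, c in sorted(hist.items())}


# Toy traces: the last two bugs are not closed and are therefore excluded.
traces = [
    ["Reported", "New", "Fixed", "Verified"],
    ["Reported", "Unconfirmed", "Duplicate"],
    ["Reported", "New", "Dev-reassign", "Fixed"],
    ["Reported", "New", "Fixed", "Reopened"],
    ["Reported", "New"],
]
print(events_per_case(traces))  # {3: 66.67, 4: 33.33}
```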

    [Figure 5: x-axis: Events per case (3, 4, 5, 6, 7, 8, 9, 10, >10); y-axis: Percentage of cases (0 to 60); series: Core, Firefox]

    Figure 5: Events per case with maximum cases having 3 events in the lifecycle.

    4.4 Unique Traces

    The complete sequence from Reported to the last event makes up the trace of a case (bug). Different bugs can have different sets of transitions and hence different traces. We consider a trace unique even if only a single transition differs. For instance, a trace with the sequence A → B → C is different from A → B → B → C because the second trace has a self-loop that is absent from the first. Only closed bugs (as in the event analysis) are used for these experiments. Effectively, there are 1164 and 622 variants for Core and Firefox respectively. However, most of them are infrequent. The distribution is almost the same for both projects. For Core and Firefox, 80% of the cases are covered by around 2% of the variants (those with at least 50 cases) and 3% of the variants (those with at least 29 cases) respectively, whereas 8% and 13% of the variants are required to cover 90% of the cases. The ten most frequent variants, with their percentage of cases, are shown in Table 4. Efforts should be made to minimize delay in the transitions involved in these traces, as they constitute the paths followed by the majority of closed cases.
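    Variant counting and coverage can be illustrated with a short sketch: traces are grouped into variants by their exact activity sequence, and variants are then added from most to least frequent until the target share of cases is reached. The function name and the toy data are assumptions for illustration.

```python
from collections import Counter


def variant_coverage(traces, target_pct):
    """Return (number_of_variants, variants_needed) to cover target_pct of cases."""
    traces = list(traces)
    # Two traces belong to the same variant only if every transition matches.
    variants = Counter(tuple(trace) for trace in traces)
    needed, covered = 0, 0
    for _, count in variants.most_common():
        if 100.0 * covered / len(traces) >= target_pct:
            break
        covered += count
        needed += 1
    return len(variants), needed


# Five toy cases: A-B-C three times, plus a self-loop variant and a shortcut.
traces = [["A", "B", "C"]] * 3 + [["A", "B", "B", "C"], ["A", "C"]]
print(variant_coverage(traces, 60))  # (3, 1)
print(variant_coverage(traces, 80))  # (3, 2)
print(variant_coverage(traces, 90))  # (3, 3)
```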

    5. PERFORMANCE ANALYSIS

    5.1 Reopen AnalysisReopened bugs10 increase the maintenance costs, degrade

    overall user-perceived quality of the software and lead to un-necessary rework by busy practitioners [8]. If fair numberof fixed bugs are reopened, it could indicate instability inthe software system [11]. Bug reopens have a significant im-portance in both open source and commercial software sys-tems. Understanding bug reopening is of significant interestto the practitioners community in order to characterize ac-tual quality of the bug fixing process [11]. Therefore, theanalysis of reopened bugs will help process owner to takepreventive actions to minimize the reopening of bugs.

    For both products, of all the labels, Wontfix is more likely to be reopened than the others, as shown in Figure 6a. Also, very few cases are reopened after QA reassignment, component reassignment and developer reassignment. Shihab et al. [8] showed that the last status of the closed bug is an important influential factor for reopening. The reasons [11] [8] for reopening a closed bug with a given label could be:

    • Wontfix/Invalid/Incomplete/Worksforme: Clear steps to reproduce and additional information become available, making it possible to better understand the reported bug and determine the root cause. Also, the priority of the bug may have been underestimated, leading to its closure.

    • Duplicate: A bug accidentally marked as duplicate due to similar symptoms or a title matching an existing bug, without proper understanding of the root cause.

    • Fixed: Incompletely or incorrectly fixed bugs due to poor understanding of the root cause, a regression bug (the bug reappears in the new version) or boundary cases missed by the developer during testing.

    • Verified: Incorrectness realized later, or extra information becomes available which triggers reopening of the bug.

    ¹⁰ According to Bugzilla@Mozilla, the bug was once resolved, but the resolution was deemed incorrect.

    S.No.  Firefox                                       Core
           %age   Trace                                  %age   Trace
    1      14.24  Rep → Unconfirmed → Duplicate          18.71  Rep → New → Dev-R → Fixed
    2      12.06  Rep → Unconfirmed → Invalid            14.34  Rep → New → Fixed
    3       9.55  Rep → New → Dev-R → Assigned → Fixed   12.36  Rep → New → Dev-R → Assigned → Fixed
    4       6.27  Rep → New → Duplicate                   8.48  Rep → New → Fixed
    5       5.21  Rep → Unconfirmed → Worksforme          6.62  Rep → New → Duplicate
    6       5.13  Rep → New → Fixed                       4.99  Rep → New → Worksforme
    7       4.15  Rep → New → Dev-R → Fixed               2.39  Rep → New → Invalid
    8       3.95  Rep → New → Worksforme                  1.44  Rep → New → Wontfix
    9       3.86  Rep → Unconfirmed → Incomplete          0.98  Rep → Unconfirmed → Invalid
    10      2.88  Rep → Assigned → Fixed                  0.94  Rep → New → Dev-R → Dev-R → Fixed

    Table 4: Top 10 most frequent event traces with case frequency percentage.

    Figure 6: Reopen Analysis.
    (a) Percentage of issues reopened for different states (x-axis: percentage of transition to Reopen, 0 to 10; y-axis: source state, one of Duplicate, Fixed, Incomplete, Invalid, Verified, Wontfix, Worksforme; series: Core and Firefox).
    (b) Contribution of source activities in total Reopened for Core (slice labels: 61.15%, 0.76%, 5.55%, 0.57%, 6.41%, 1.05%, 10.90%, 1.34%, 12.06%, 0.19%; legend: Fixed, Devreassign, Wontfix, QAreassign, Invalid, Incomplete, Worksforme, Verified, Duplicate, Compreassign).
    (c) Contribution of source activities in total Reopened for Firefox (slice labels: 52.42%, 0.61%, 11.82%, 0.61%, 7.27%, 1.51%, 9.70%, 1.21%, 14.54%, 0.30%; same legend).

    Figure 6a shows the percentage of bugs reopened for all the possible closed states. The interpretations from the figure are as follows:

    • A comparatively high percentage of bugs closed with the labels Wontfix, Invalid, Incomplete and Worksforme are reopened for Core. To improve quality and understanding, more effort should be made to ensure that sufficient information is retrieved from the reporter before closing. This can be done in two ways: 1. being interactive and clarifying things with the reporter after a bug has been reported, and 2. improving the initial template to capture sufficient details beforehand, to reduce the delay. Disagreement about the priority of bugs should be minimized by defining clear guidelines for deciding priority.

    • The chance of a Duplicate bug being reopened in Core (around 5%) is more than double that in Firefox (around 2%). A bug should be thoroughly understood before it is marked Duplicate. The decision should not be based on superficial attributes.

    • After Wontfix, bugs resolved as Fixed are prone to reopening in Firefox (6%), unlike Core (where Worksforme follows Wontfix). A proper understanding of the root cause is necessary for fixing a bug. Assumptions about the reported bug should be avoided. If something is unclear, the owner should ask the reporter for clarification and the necessary information. At least the minimal required testing of fixed bugs should be performed to avoid failures at later stages.

    • The chance of reopening is very low for Verified bugs (1% or less). So, the QA manager should be encouraged to verify the final resolution carefully.

    Of all the reopened bugs, the major contribution is from Fixed, Duplicate, Wontfix, Worksforme and Invalid. The rest constitute a very small percentage, as is evident from Figure 6c for Firefox and Figure 6b for Core. Since the total number of Fixed bugs is high, even if a small percentage of Fixed bugs get reopened, they contribute more towards the total number of reopened bugs. Hence the reasons for reopening of Fixed bugs should be dealt with at high priority.
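    The difference between the two views of reopening (the per-state reopen rate of Figure 6a versus the share of all reopens in Figures 6b and 6c) can be made concrete with a small sketch. It assumes, as an illustration only, that the reopen rate of a state is the number of times the state is immediately followed by Reopened divided by the number of times the state occurs; the function name `reopen_stats`, the state label "Reopened" and the toy traces are hypothetical.

```python
from collections import Counter


def reopen_stats(traces, reopen="Reopened"):
    """Map each source state to (reopen rate in %, share of all reopens in %).

    Assumption: the rate denominator is the number of occurrences of the state.
    """
    occurrences = Counter()  # how often each state appears in any trace
    reopens = Counter()      # how often each state is directly followed by `reopen`
    for trace in traces:
        for i, src in enumerate(trace):
            occurrences[src] += 1
            if i + 1 < len(trace) and trace[i + 1] == reopen:
                reopens[src] += 1
    total_reopens = sum(reopens.values())
    return {
        state: (100 * n / occurrences[state], 100 * n / total_reopens)
        for state, n in reopens.items()
    }


# Toy data: Fixed is reopened rarely but is very common; Wontfix is the opposite.
traces = (
    [["Reported", "New", "Fixed"]] * 15
    + [["Reported", "New", "Fixed", "Reopened", "Fixed"]] * 2
    + [["Reported", "New", "Wontfix"]] * 3
    + [["Reported", "New", "Wontfix", "Reopened", "Fixed"]]
)
stats = reopen_stats(traces)
fixed_rate, fixed_share = stats["Fixed"]
wontfix_rate, wontfix_share = stats["Wontfix"]
```

    In this toy log Wontfix has the higher reopen rate, yet Fixed accounts for the larger share of all reopens because Fixed occurs far more often, which is the effect described in the paragraph above.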

    5.2 Self-loop and Back-forth Analysis

    It is important to study recurrent loops as they are indicators of deeper problems. It is difficult to detect such ping-pong patterns, where a bug is passed around repeatedly without any progress [2]. A self-loop is an edge that starts and ends at the same node (activity). It shows consecutive repetition of the same activity. The presence of loops is undesirable as it indicates imperfection. The diagonal elements of the confusion matrix shown in Table 6 denote the absolute number of loops, where the first entry corresponds to Core and the second to Firefox. The process maps obtained for Core and Firefox have loops mainly for Comp-reassign, Dev-reassign and QA-reassign, as observed in Table 6. For Verified, there are 9 and 3 cases for Core and Firefox respectively, which is very few and can be ignored. Most of the cases have one loop, that is, the decision made in the first attempt could be imperfect and need correction, so the same activity is repeated twice. The largest numbers of loops are for developer reassignment (1296/222) and component reassignment (471/257). It is not easy to decide the component to which a bug pertains and the person to whom the bug should be assigned. The number of loops is more than one in some cases, which means that a lot of time goes into deciding the right assignment and identifying the component. Loops cause undesired delay, which is as high as 50.25 days and 30.35 days for Firefox and Core respectively in the case of developer reassignment, as shown in Table 5.

              Firefox                     Core
              Dev      Comp     QA        Dev      Comp     QA
    Mean      50.2     12.7     13.7      32.3     12.2     13.9
    Median    9.8      0.2      0.3       5.4      0.2      0.8
    Min       8 s      9 s      4 s       7 s      12 s     4 s
    1Q        13.0 h   18.7 m   1.0 m     12.2 h   21.6 m   10.4 m
    3Q        65.9     2.6      19.4      28.7     2.4      6.9
    Max       437.8    334.9    133.1     546.7    450.5    161.1
    SD        84.8     41.5     28.3      67.2     43.4     34.5

    Table 5: Duration of loops in the given states (unit: days by default; hours (h), minutes (m) and seconds (s) where marked). Min: minimum, 1Q: 1st quartile, 3Q: 3rd quartile, Max: maximum, SD: standard deviation.

    Back-forth: A transition from a state A to another state B and then back to state A is defined as one back-forth loop. We observe the back-forth phenomenon between pairs of activities (also referred to as a ping-pong pattern in the literature) in the runtime process model. Table 6 shows the results of our empirical analysis for identifying the back-forth pattern. Each cell of the Table 6 matrix (except the diagonal values) shows the frequency of the back-forth loop between activity pairs (a 14 × 14 matrix in which the row activity is state A and the column activity is state B of the back-forth loop). We notice that Unconfirmed, Component-reassignment, Developer-reassignment, QA-reassignment and Reopen are the activities most frequently involved in the back-forth pattern. Our experimental results show that a Fixed bug being reopened and then resolved as Fixed again is frequent (refer to Table 6: 466 times for Core and 114 times for Firefox), indicating a regression bug or imperfection. A back-forth loop between the Invalid, Worksforme, Duplicate or Wontfix state and Reopened shows disagreement between developers. One explanation can be that the team members are not convinced by the initial decision and reopen the bug, resulting in loss of productivity and an increase in the mean time to repair. Back-forth between Unconfirmed and any resolution is permitted according to the pre-defined Bugzilla lifecycle, and we also observe it in the as-is process.
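    A minimal sketch of how the self-loop and back-forth counts of such a matrix could be derived from event traces is given below. It assumes that every occurrence of a pattern is counted (rather than every case), which is one reading of "absolute number of loops"; the function name `loop_matrix` and the sample trace are hypothetical.

```python
from collections import Counter


def loop_matrix(traces):
    """Count self-loops under key (A, A) and back-forth loops A -> B -> A under key (A, B)."""
    counts = Counter()
    for trace in traces:
        # Self-loop: the same activity appears twice in a row (diagonal entry).
        for a, b in zip(trace, trace[1:]):
            if a == b:
                counts[(a, a)] += 1
        # Back-forth: A, then a different activity B, then A again (off-diagonal entry).
        for a, b, c in zip(trace, trace[1:], trace[2:]):
            if a == c and a != b:
                counts[(a, b)] += 1
    return counts


# Assumed sample trace: one developer reassignment loop and one Fixed/Reopened ping-pong.
sample = [[
    "Reported", "New", "DevR", "DevR", "Assigned", "Fixed", "Reopened", "Fixed",
]]
matrix = loop_matrix(sample)
```

    Storing the counts in a sparse `Counter` keyed by state pairs keeps cells with no occurrence absent, mirroring the blank cells of the table.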

    5.3 Bottleneck Analysis

    A bottleneck is defined as a specific part of the runtime process map which is relatively time consuming and reduces the overall performance of the end-to-end process. Bottleneck identification (the most time consuming states, activities and transitions) from an as-is process model is actionable information for the process owner (from the perspective of root-cause analysis and process improvement). We compute the mean time for every transition defined in the bug lifecycle and make the following observations:

    1. Our analysis shows that for the Firefox project, confirmation to the New state takes 33.4 days (mean value). In contrast, direct assignment (directly to the Assigned state) takes 19.4 days on average. Confirmation seems to be a bottleneck (and hence can be expedited, as it is resulting in delays). We observe that for the Core project, the mean time taken for confirmation to New is 9.3 days, which is much less than for Firefox.

    2. The time taken for the transition New → Assigned is 17.1 days and 17.4 days for Core and Firefox respectively. This means that on average there is a delay of around 17 days in the assignment of a bug from the New state.

    3. An Assigned bug is labelled as New in the case of a developer reassignment event (restart). Experimental results show that the time elapsed between the two states is 57.4 days for Firefox and 6.4 days for Core. This should be avoided by correct developer assignment in the first attempt, or by quick reassignment if the bug is wrongly assigned, because one wrong assignment delays resolution for Firefox by 57.4 days on average.

    4. Resolution of a bug as Duplicate from the possible source states (Unconfirmed, Assigned (not for Core), Reopened and New) takes comparatively less time for both projects, indicating efficiency in duplicate bug report identification. For Firefox, it is also effective, as the chances of reopening are fairly low (Figure 6a).

    5. Experimental results show that identification of cases where the resolution is Worksforme, Wontfix or Incomplete is more time consuming, irrespective of the source state from which the resolution is made. We believe the actionable information is that more efficiency improvement is required in their identification. For instance, resolution from New to Worksforme and Wontfix takes an exceptionally long duration of 19.3 weeks and 31.1 weeks respectively for Firefox, and 20.5 weeks and 81.7 days for Core. For Core, Assigned to Wontfix takes even more, that is, 14.8 weeks.

    6. For both projects, resolving a bug as Fixed takes less time from Assigned, as the developer has already taken possession and worked to fix it. When a bug is Fixed directly from Unconfirmed or New, it takes more time.

    7. The chances of a Verified bug getting reopened are very low, and it takes 8.3 and 18.9 days on average to reopen a verified bug for Firefox and Core respectively. However, for Firefox, Wontfix, Fixed and Worksforme are more likely to be reopened, and there is a delay of 14.9, 13.9 and 45.6 days in the reopening of bugs from the respective states. For Core, there is comparatively more delay in the reopening of bugs resolved as Worksforme, Wontfix and Incomplete (24.1, 18.3 and 38.1 days respectively), and these have good chances of getting reopened. Therefore, the reasons for their reopening should be dealt with at higher priority.
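    The mean time per transition that underlies these observations can be computed along the following lines. This is a sketch under stated assumptions: each case is assumed to be a time-ordered list of (state, timestamp) pairs, and the function name `mean_transition_days` and the sample timestamps are hypothetical rather than drawn from the studied logs.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def mean_transition_days(cases):
    """Return {(source, destination): mean elapsed time in days} over all cases."""
    durations = defaultdict(list)
    for events in cases:
        # Pair every event with its successor to obtain one transition and its duration.
        for (src, t0), (dst, t1) in zip(events, events[1:]):
            durations[(src, dst)].append((t1 - t0).total_seconds() / 86400)
    return {transition: mean(days) for transition, days in durations.items()}


# Assumed sample cases with made-up timestamps.
cases = [
    [("Reported", datetime(2013, 1, 1)), ("New", datetime(2013, 1, 2)),
     ("Assigned", datetime(2013, 1, 12)), ("Fixed", datetime(2013, 1, 14))],
    [("Reported", datetime(2013, 2, 1)), ("New", datetime(2013, 2, 4)),
     ("Assigned", datetime(2013, 2, 24)), ("Fixed", datetime(2013, 2, 28))],
]
means = mean_transition_days(cases)
# Transitions sorted from slowest to fastest; the head of the list is the candidate bottleneck.
ranked = sorted(means.items(), key=lambda item: item[1], reverse=True)
```

    Sorting the transitions by mean duration gives a simple ranking of bottleneck candidates; the median or quartiles could be used instead when a few very long cases dominate the mean.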

    Activity UN New CompR DevR QAR Asg Invalid ReO Wfm InC WF Dup Fixed Ver
    UN 3/5 22/53 3/22 5/8 1/16 13/35 /2
    New 13/5 2/1
    CompR /2 75/17 471/257 66/4 245/90 /1 6/7 3/3 1 2/1 7/4 6/2
    DevR 3/ 66/ 92/ 1296/ 21/ 172/ 2 2/ 9/1
         21 17 222 3 99 2 1
    QAR 1 192/65 13/1 88/29 8/2 1 5 3
    Asg 16/7
    Invalid 13/22 10/3 1 /1
    ReO 5/2 2/2 5/7 4/1 52/16
    Wfm 2/9 17/8 /1 2
    InC 4/4 1 /1 1
    WF 4/15 9/12 /1
    Dup 8/25 1/1 13/4 /1 /1 1/1
    Fixed 7/63 466/114
    Ver 9/3

    Table 6: Loop and Back-forth confusion matrix where 1st entry corresponds to Core and 2nd to Firefox (blank if no occurrence).

    6. CONFORMANCE TESTING

    Conformance checking aims at the detection of inconsistencies between the design time process model and the model obtained for the as-is process from the runtime event log. We define metrics to measure fitness, that is, how well the observed process complies with the control flow defined in the design time process model, and to locate the point of inconsistency. We believe that this can be useful for the identification and removal of obsolete parts which demand extra maintenance effort [6].

    We propose Algorithm 1 to evaluate the fitness metric. We find the number of cases with valid traces (only defined transitions) and its ratio to the total number of cases to measure the extent of fitness. The event log and the adjacency matrix A are given as input. A has the source state as row and the destination state as column (15 × 15 in our case), with 1 in a cell if the transition is permitted and 0 otherwise. We obtain the event trace (an array with the states from the event log in sequential order of their occurrence) for each case ID, as shown in step 4 of Algorithm 1. As an optimization, we identify the unique traces and count the frequency of each. This is useful because most of the cases have the same trace (refer to Table 4), so we need not validate each case individually. Each unique trace is verified against the adjacency matrix A for conformance. If it has only permitted transitions, the valid bit Vi is assigned the value 1, else 0. If the evaluated value of the fitness metric (refer to line 21 of Algorithm 1) is less than 1, there is a deviation from the defined model. To detect the cause of the inconsistency, inconsistentDetector() (Algorithm 2) is invoked, which takes the event log and the adjacency matrix A as input and returns inconsistency metrics. We create a footprint matrix (refer to step 10 of Algorithm 2) of the runtime event log and compare it with the design model (adjacency matrix) to identify inconsistent transitions (non-zero elements in ITF). All states (15 in our case) are stored in an array, state. The frequency of transition between each pair of states is counted from the event log and stored in the Transition Frequency matrix, TF. The total number of inconsistent transitions is evaluated by adding the elements of ITF (in line 14 of Algorithm 2). We evaluate the frequency (maximum element of ITF) and the most frequent inconsistent transition in steps 15 and 16 of Algorithm 2.
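    The two procedures described above can be sketched compactly in Python. This is an illustrative reading of the description, not the authors' code: the design model is represented as a set of permitted (source, destination) pairs instead of a 15 × 15 matrix, and the function names `fitness_metric` and `inconsistent_transitions`, as well as the toy model and traces, are assumptions.

```python
from collections import Counter


def fitness_metric(traces, allowed):
    """FM = (cases whose trace has only permitted transitions) / (all cases).

    `allowed` plays the role of the 1-cells of the adjacency matrix A.
    """
    freq = Counter(tuple(t) for t in traces)  # Fi for each unique trace
    valid_cases = sum(
        f for trace, f in freq.items()
        if all(pair in allowed for pair in zip(trace, trace[1:]))  # Vi == 1
    )
    return valid_cases / sum(freq.values())


def inconsistent_transitions(traces, allowed):
    """Return (ITF counts, total inconsistent transitions, most frequent one or None)."""
    # TF: frequency of every observed transition in the event log.
    tf = Counter(pair for t in traces for pair in zip(t, t[1:]))
    # ITF: observed transitions that the design model does not permit.
    itf = Counter({pair: n for pair, n in tf.items() if pair not in allowed})
    total = sum(itf.values())
    worst = itf.most_common(1)[0] if itf else None
    return itf, total, worst


# Toy design model and event traces (assumed, far smaller than the real lifecycle).
allowed = {
    ("Reported", "New"), ("New", "Assigned"),
    ("Assigned", "Fixed"), ("New", "Fixed"),
}
traces = (
    [["Reported", "New", "Fixed"]] * 8
    + [["Reported", "New", "Assigned", "Fixed"]]
    + [["Reported", "Fixed"]]  # not permitted by the toy model
)
fm = fitness_metric(traces, allowed)
itf, total_inconsistent, worst = inconsistent_transitions(traces, allowed)
```

    Validating each unique trace once and weighting it by its frequency follows the optimization described in the text; the set-of-pairs representation is only a convenience and is equivalent to looking up A[source][destination] == 1.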

    To perform the experiments, we consider only closed cases up to the last resolution state. Since we consider more activities than are defined in the Bugzilla lifecycle, we define a design model (adjacency matrix) with extra states based on practitioners' inputs and our understanding. The value of the fitness metric, FM, using Algorithm 1 is 0.86 and 0.91 for Core and Firefox respectively. It clearly shows that Firefox has higher conformance than Core. However, both projects have good compliance with the defined design model. 818 traces out of 1164 total unique traces are valid for Core. Similarly, 403 traces out of 622 are valid for Firefox, which implies that invalid traces usually have low frequency. As the FM

    Algorithm 1 evaluateFitnessMetric()

    Require: Event log, Adjacency matrix A
    Ensure: Fitness measure
     1: while not at the end of Event log do
     2:   entries with Case ID i
     3:   if tsC > tsB > tsA > tsReported then
     4:     Trace, Ti = {Reported, A, B, C};
     5:   end if
     6: end while
     7: Count frequency of each unique trace UTi as Fi;
     8: while UTi do
     9:   m := size of trace;
    10:   p := UTi[1]; q := UTi[2]; valid bit for UTi, Vi = 1;
    11:   while p < m do
    12:     if A[p][q] == 1 then
    13:       p++;
    14:       q++;
    15:     else
    16:       Vi = 0;
    17:       break;
    18:     end if
    19:   end while
    20: end while
    21: Calculate Fitness metric:
          $$FM = \frac{\sum_{i=1}^{N} (F_i \times V_i)}{\sum_{i=1}^{N} F_i}$$
        where N = number of unique traces.
    22: if FM