detecting antipatterns in android apps

24
HAL Id: hal-01122754 https://hal.inria.fr/hal-01122754v2 Submitted on 6 Mar 2015 HAL is a multi-disciplinary open access archive for the deposit and dissemination of sci- entific research documents, whether they are pub- lished or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers. L’archive ouverte pluridisciplinaire HAL, est destinée au dépôt et à la diffusion de documents scientifiques de niveau recherche, publiés ou non, émanant des établissements d’enseignement et de recherche français ou étrangers, des laboratoires publics ou privés. Detecting Antipatterns in Android Apps Geoffrey Hecht, Romain Rouvoy, Naouel Moha, Laurence Duchien To cite this version: Geoffrey Hecht, Romain Rouvoy, Naouel Moha, Laurence Duchien. Detecting Antipatterns in Android Apps. [Research Report] RR-8693, INRIA Lille; INRIA. 2015. hal-01122754v2

Upload: others

Post on 16-Oct-2021

2 views

Category:

Documents


0 download

TRANSCRIPT

HAL Id: hal-01122754https://hal.inria.fr/hal-01122754v2

Submitted on 6 Mar 2015

HAL is a multi-disciplinary open accessarchive for the deposit and dissemination of sci-entific research documents, whether they are pub-lished or not. The documents may come fromteaching and research institutions in France orabroad, or from public or private research centers.

L’archive ouverte pluridisciplinaire HAL, estdestinée au dépôt et à la diffusion de documentsscientifiques de niveau recherche, publiés ou non,émanant des établissements d’enseignement et derecherche français ou étrangers, des laboratoirespublics ou privés.

Detecting Antipatterns in Android AppsGeoffrey Hecht, Romain Rouvoy, Naouel Moha, Laurence Duchien

To cite this version:Geoffrey Hecht, Romain Rouvoy, Naouel Moha, Laurence Duchien. Detecting Antipatterns in AndroidApps. [Research Report] RR-8693, INRIA Lille; INRIA. 2015. �hal-01122754v2�

ISS

N02

49-6

399

ISR

NIN

RIA

/RR

--86

93--

FR+E

NG

RESEARCHREPORTN° 8693March 2015

Project-Teams Spirals

Detecting Antipatterns inAndroid AppsGeoffrey Hecht, Romain Rouvoy, Naouel Moha, Laurence Duchien

RESEARCH CENTRELILLE – NORD EUROPE

Parc scientifique de la Haute-Borne40 avenue Halley - Bât A - Park Plaza59650 Villeneuve d’Ascq

Detecting Antipatterns in Android Apps

Geoffrey Hecht, Romain Rouvoy, Naouel Moha, LaurenceDuchien

Project-Teams Spirals

Research Report n° 8693 — March 2015 — 20 pages

Abstract: Mobile apps are becoming complex software systems that must be developed quicklyand evolve continuously to fit new user requirements and execution contexts. However, addressingthese constraints may result in poor design choices, known as antipatterns, which may incidentallydegrade software quality and performance. Thus, the automatic detection of antipatterns is animportant activity that eases both maintenance and evolution tasks. Moreover, it guides developersto refactor their applications and thus, to improve their quality. While antipatterns are well-knownin object-oriented applications, their study in mobile applications is still in their infancy. In thispaper, we propose a tooled approach, called Paprika, to analyze Android applications and to detectobject-oriented and Android-specific antipatterns from binaries of mobile apps. We validate theeffectiveness of our approach on a set of popular mobile apps downloaded from the Google PlayStore.

Key-words: Android, antipattern, mobile app, software quality

Détection d’anti-patrons dans les applications AndroidRésumé : Les applications mobiles deviennent des systèmes logiciels complexes qui doiventêtre développés rapidement et évoluer continuellement pour s’adapter aux nouvelles exigences desutilisateurs et à de multiples contextes d’exécution. La réponse à ces changements peut mener àde mauvaises solutions de conceptions ou d’implémentations, connues sous le nom d’anti-patrons,qui peuvent dégrader la qualité du logiciel ainsi que ses performances. Par conséquent, la détec-tion automatique de ces anti-patrons est importante pour faciliter les tâches de maintenance etd’évolutions des applications. Cela peut aussi aider les développeurs à réusiner leurs applicationset par conséquent augmenter leurs qualités. Bien que les anti-patrons soient bien connues pourles applications orientés objets, leur étude pour les applications mobiles est encore à ses balbu-tiements. Dans ce rapport, nous proposons une approchée outillée nommée Paprika qui permetd’analyser les binaires d’applications Android afin de détecter des anti-patrons orientés objets etspécifiques à Android. Nous validons l’efficacité de notre approche sur un ensemble de plusieursapplications populaires téléchargées depuis le Google Play Store.

Mots-clés : Android, anti-patron, application mobile, qualité logicielle

Detecting Antipatterns in Android Apps 3

Contents1 Introduction 4

2 Background on Android Packageand Bytecode 5

3 Related Work 5

4 Paprika: A tooled Approach to de-tect Software Anti-Patterns 74.1 Overview of the Approach . . . . 74.2 Step 1: Collecting Metrics from

Application Artifacts . . . . . . . 74.3 Step 2: Converting Paprika

Model as a Graph Model . . . . . 104.4 Step 3: Detecting Anti-patterns

from Graph Queries . . . . . . . 10

5 Empirical Validation 125.1 Research Questions . . . . . . . . 125.2 Subjects . . . . . . . . . . . . . . 135.3 Objects . . . . . . . . . . . . . . 135.4 Evaluation Protocol . . . . . . . 135.5 Preliminary Assessments . . . . . 145.6 Analysis Results . . . . . . . . . 15

5.6.1 RQ1 Can antipatternsoriginating from thesource code be detectedat the bytecode level? . . 15

5.6.2 RQ2 To what extent mo-bile apps exhibit OO an-tipatterns? . . . . . . . . 15

5.6.3 RQ3 To what extentmobile apps exhibitAndroid-specific antipat-terns? . . . . . . . . . . . 16

5.6.4 RQ4 Are OO andAndroid-specific antipat-terns present in the sameproportion? . . . . . . . . 16

5.6.5 Other observations . . . . 175.7 Threats to validity . . . . . . . . 17

6 Conclusion and future work 17

RR n° 8693

Detecting Antipatterns in Android Apps 4

1 Introduction

Along the last years, the development of mo-bile applications (apps) has reached a great suc-cess. In 2013, Google Play Store1 reached over50 billion app downloads [3] and is estimatedto reach 200 billion by 2017 [6]. This suc-cess is partly due to the adoption of establishedObject-Oriented (OO) programming languages,such as Java, Objective-C or C#, to developthese mobile apps. However, the developmentof a mobile app differs from a standard onesince it is necessary to consider the specifici-ties of mobile platforms. Additionally, mobileapps tend to be smaller applications, which relymore heavily on external libraries and reuse ofclasses [28, 32, 40].

In this context, the presence of common soft-ware anti-patterns can be imposed by the un-derlying frameworks [24, 38]. Software antipat-terns are bad solutions to known design is-sues and they correspond to defects related tothe degradation of the architectural propertiesof a software system [16]. Moreover, antipat-terns tend to hinder the maintenance and evo-lution tasks, not only contributing to the tech-nical debts, but also incurring additional costsof development. Furthermore, in the case ofmobile apps, the presence of antipatterns maylead to resource leaks (CPU, memory, battery,etc.) [17], thus preventing the deployment ofsustainable solutions. The automatic detectionof such software antipatterns is therefore be-coming a key challenge to assess the quality,ease the maintenance and the evolution of thesemobile apps, which are invading our daily lives.However, the existing tools to detect such soft-ware antipatterns are limited and are still intheir infancy, at best [38].

Mobile apps are mainly distributed throughapp stores, such as Apple Store, Google PlayStore or Windows Phone Store, which do notprovide access to their source code [28]. Cat-alogs of open-source applications are availableonline2, but there is no evidence that the hostedapps are representative of the ones distributedby official app stores. Therefore, it is neces-

1https://play.google.com/store2https://f-droid.org

sary to analyze non open-source apps to acquiresubstantial and significant knowledge and dataconcerning the presence of antipatterns in mostof the mobile apps.

We thus aim at improving the quality of mo-bile apps by mining legacy apps. This paperfocuses on the analysis of official mobile appsto detect the presence of software antipatterns.In particular, we introduce Paprika as a tooledapproach to analyze the code of Android appsand to detect both common OO and Android-specific antipatterns. Paprika innovates byanalyzing the Android package of mobile apps.To the best of our knowledge, this is the first ap-proach to detect common software antipatternsfrom this package. Paprika extracts qual-ity metrics from application bytecode that arestored persistently. The resulting model can befurther queried by Paprika to detect the pres-ence of software antipatterns. One example ofa detected OO antipattern is the Blob class,which is a class with a low cohesion betweenmethods and a large number of attributes andoperations [16]. As another example, the In-ternal Getter/Setter is an Android-specific an-tipattern known to degrade the performance onthis system [1, 17]. It occurs when a methodcalls a getter or a setter of its own class. Weassess our approach by reporting on the resultswe obtained for the detection of 8 antipatternson 15 popular Android apps downloaded fromthe Google Play Store. In this study, we ad-dress the following 4 research questions:RQ1 : Can antipatterns originating from thesource code be detected at the bytecode level?Finding: Yes, by analyzing the bytecode weare able to detect the presence of antipatterns.The analysis of bytecode is efficient even whencode obfuscation is used to prevent reverse-engineering.RQ2 : To what extent mobile apps exhibit OOantipatterns?Finding: We found OO antipatterns in all an-alyzed apps. Overall, the OO antipatterns areas common in Android apps as in non-mobileapplications. However, a particularity existsfor mobile apps: the Activity class of the An-droid Framework tends to be more sensitivethan other classes.

RR n° 8693

Detecting Antipatterns in Android Apps 5

RQ3 : To what extent mobile apps exhibitAndroid-specific antipatterns?Finding: We found Android-specific antipat-terns in all analyzed apps. They are really com-mon and frequent, despite the fact that they areeasy to refactor.RQ4 : Are OO and Android-specific antipat-terns present in the same proportion?Finding: The Android-specific antipatternsare far more frequent and common than OOantipatterns.

The rest of the paper is organized as follows.We provide some background on Android pack-age and bytecode in Section 2. We compareto the related works in Section 3. The detailsof our framework are introduced in Section 4.We validate our approach empirically on 15 ap-plications in Section 5. Section 6 summarizesour work and outlines some avenues for futureworks.

2 Background on AndroidPackage and Bytecode

This section provides a short overview of thespecificities of Android Application Package(APK) and Dalvik bytecode.

Android apps are distributed using the APKfile format. APK files are archive files ina ZIP format, which are organized as fol-lows: 1. the file AndroidManifest.xml de-scribes application metadata including name,version, permissions and referenced library filesof the application, 2. the directory META-INFthat contains meta-data and certificate infor-mation, 3. an asset and a res directory con-taining non-compiled resources, 4. a lib direc-tory for eventual native code used as library,5. a resources.arsc file for pre-compiled re-sources, and 6. a .dex file containing the com-piled application classes and code in dex file for-mat [4]. While Android apps are developed us-ing the Java language, they use the Dalvik Vir-tual Machine as a runtime environment. Themain difference between the Java Virtual Ma-chine (JVM) and the Dalvik Virtual Machineis that Dalvik is register-based, in order to bememory efficient compared to the stack-based

JVM [4]. The resulting bytecode compiled fromJava sources and interpreted by the Dalvik Vir-tual Machine is therefore different.

Disassembler exists for the Dex format [8]and tools to transform the bytecode into in-termediate languages or even Java are numer-ous [11, 14, 19, 33]. However, there is an im-portant loss of information during this trans-formation for all the existing approaches. Forinstance, additional algorithms have to be usedto infer the type of local variables or to deter-mine the type of branches as for, while andif constructions are replaced by goto instruc-tions in the bytecode [4, 14]. Some dependen-cies are also absent from the Dex files, result-ing in phantom classes, which cannot be ana-lyzed without the source code. And, of course,the native code included in the lib directorycannot be decompiled with these tools. It isalso important to note that around 30% of allthe mobile apps distributed on Google Storeare obfuscated [40] in order to prevent reverse-engineering. The ProGuard tool used to obfus-cated code is even pre-installed on the beta ofAndroid Studio provided by Google to replaceEclipse ADT [2]. It is likely that code obfusca-tion will be even more common in the future.With obfuscation, most classes and methodsare renamed, often with just one or two alpha-betical characters, leading to the loss of mostof lexical properties. Fortunately, the applica-tion structure is preserved and classes from theAndroid Framework are not renamed, thus al-lowing to retrieve some information from theclasses that inherit them.

3 Related Work

In this section, we discuss the relevant litera-ture about analysis and antipatterns detectionin mobile apps.

Mobile apps are mostly developed usingOO languages, such as Java or Objective-C.Since their definition by Chidamber and Ke-merer [18], OO metrics have gained popular-ity to assess software quality. Numerous worksvalidated OO metrics to be efficient qualityindicators [13, 15, 23, 34]. This has lead

RR n° 8693

Detecting Antipatterns in Android Apps 6

to the creation of tooled approaches, such asDECOR [29] or iPlasma [26], which are usingOO metrics to detect code smells and antipat-terns in OO applications. Most of the codesmells and antipatterns, like long method orblob class, detected by these approaches are in-spired by the work of Fowler [21] and Brown etal. [16]. These approaches are compatible withJava, but since they were mostly developed be-fore the emergence of mobile apps they are nottaking into account the specificities of Androidapps and are not compatible with Dex byte-code.

With regard to mobile apps, Linares-Vásquez et al. [24] used DECOR to performthe detection of 18 different OO antipatternsin mobile apps built using Java Mobile Edi-tion (J2ME) [5]. This large-scale study wasperformed on 1, 343 apps and shows that thepresence of antipatterns negatively impacts thesoftware quality metrics, in particular metricsrelated to fault-proneness. They also foundthat some antipatterns are more common incertain categories of Java mobile apps. Con-cerning Android, Verloop [38] used popularJava refactoring tools, such as PMD [7] orJdeodorant [35] to detect code smells, likelarge class or long method in open-source soft-ware. They found that antipatterns tend to ap-pear at different frequencies in classes that in-herits from the Android Framework (called coreclasses) compare to classes which are not (callednon-core classes). For example, long methodwas detected twice as much in core classes interm of ratio. However, they did not consideredAndroid-specific antipatterns in both of thesestudies.

The detection and the specification ofmobile-specific antipatterns are still consideredas open issues. Reimann et al. [30] propose acatalog of 30 quality smells dedicated to An-droid. These code smells are mainly originatedfrom the good and bad practices documentedonline in Android documentations or by devel-opers reporting their experience on blogs. Theyare concerning various aspect like implementa-tions, user interfaces or database usages. Theyare reported to have a negative impact on prop-erties, such as efficiency, user experience or se-

curity. We chose to detect some of these codesmells with our approach, which are presentedin Section 4.4. We selected antipatterns thatcan be detected by static analysis and despitecode obfuscation. Reimann et al. are also offer-ing the detection and correction of code smellsvia the Refactory tool [31]. This tool can de-tect the code smells from an EMF model. Thesource code can be converted to EMF if nec-essary. However, we have not been yet able toexecute this tool on an Android app. Moreover,there is no evidence that all the antipatterns ofthe catalog are detectable using this approach.

Concerning the analysis of Android appsand the study of their specificities, theSAMOA [28] tool allows developers to ana-lyze their mobile apps from the source code.The tool collects metrics, such as the numberof packages, lines of code, or the cyclomaticcomplexity. It also provides a way to visu-alize external API calls as well as the evolu-tion of metrics along versions and a comparisonwith other analyzed apps. They performed thisanalysis on 20 applications and discovered thatthey are significantly different from classicalsoftware systems. They are smaller, and makean intensive usage of external libraries, whichleads to a more complex code to understandduring maintenance activities. Ruiz et al. [32]analyzed Android packages to understand thereuse of classes in Android apps. They extractbytecode and then analyze classes signature forthis purpose. They discovered that softwarereuse via inheritance, libraries, and frameworksis prevalent in mobile apps compared to regu-lar software. Xu [40] also examined APK of122, 570 applications. He determines that de-veloper errors are common in manifest and per-missions. He also analyzed the apps’ code andobserved that Java reflection and code obfus-cation are widely used in mobile apps, makingreverse-engineering harder. He also noticed theheavy usage of external libraries in its corpus ofanalyzed apps. Nonetheless, antipatterns werenot considered as part of these studies.

RR n° 8693

Detecting Antipatterns in Android Apps 7

4 Paprika: A tooled Ap-proach to detect SoftwareAnti-Patterns

In this section, we introduce the key compo-nents of Paprika, our tooled approach for an-alyzing the design of mobile apps in order todetect software antipatterns.

4.1 Overview of the Approach

Paprika builds on a three-step approach,which is summarized in Figure 1. As a firststep, Paprika parses the APK file of the mo-bile app under analysis to extract some meta-data (e.g., app name, package) and a represen-tation of the code. Additional metadata (e.g.,rating, number of downloads) are also extractedfrom the Google Play Store and passed as ar-guments. This representation is then automat-ically visited to compute a model of the code(including classes, methods, attributes) as agraph annotated with a set of raw quality met-rics (cf. Section 4.2). As a second step, thismodel stored into a graph database (cf. Sec-tion 4.3). Finally, the third step consists inquerying the graph to detect the presence ofcommon antipatterns in the code of the ana-lyzed apps (cf. Section 4.4). Paprika is builtfrom a set of components fitting these steps inorder to leverage different analyzers, databasesor antipatterns detection mechanisms.

4.2 Step 1: Collecting Metricsfrom Application Artifacts

Input: One APK file and its correspondingmetadata.Output: A Paprika quality model includingentities, metrics and properties.Description: This steps consists in generatinga model of the mobile app and extracting theraw quality metrics from an input artifact. Thismodel is built incrementally, while analyzingthe bytecode, and complemented with proper-ties collected from the Google Play Store. Fromthis representation, Paprika builds a modelbased on six entities: App, Class, Method, At-

tribute and Variable. The properties describedin Table 1 are attached as attributes to theseentities, while they are linked together by therelationships reported in Table 2.

Paprika proceeds with the extraction ofmetrics for each entity. The 34 metrics cur-rently available in Paprika are reported in Ta-ble 3. Paprika supports two kinds of metrics:OO and Android-specific. Boolean metrics areused to determine different kinds of entities,whereas integers are used for counters or whenthe metrics are aggregated. Contrary to theproperties, metrics often require computationor to process the bytecode representation. Forexample, it is necessary to browse the inheri-tance tree in order to determine if a class in-herits from some Android Framework-specificfundamentals classes, which include:

• Activity represents a single screen on theuser interface. Activity may start othersactivities from the same or a different ap-plication;

• Service is a task that runs in the back-ground to perform long-running operationsor to work for remote processes;

• Content provider manages shared data andtransfer between apps;

• Broadcast receiver can listen and respondto system-wide broadcast announcementsfrom the system or other apps;

• Application is used to maintain a globalapplication state.

Some composite metrics,such as ClassComplexity orLackofCohesionInMethods, require morecomputation based on other raw metrics, thusthey are computed at the end of the process.Implementation: We use the Soot frame-work [37] and its Dexpler module [14] to ana-lyze APK artifacts. Soot converts the Dalvikbytecode of mobile apps into a Soot inter-nal representation, which is similar to the Javalanguage. Soot can also be used to generatethe call graph of the mobile app. This modelis built incrementally by visiting the inter-nal representation of Soot, and complemented

RR n° 8693

Detecting Antipatterns in Android Apps 8

GraphDB

APK

Metadata

Step 1 : Model Generation( Section IV.B)

Constructmodel

Extractmetrics

Step 3 : Anti-pattern Detection(Section IV.D)

Determinethresholds

Executequeries

Paprikamodel

Graphmodel

Anti-patternqueries

Detectedanti-patterns

Step 2 : Model Conversion(Section IV.C)

Convertentities

Convertmetrics

Figure 1: Overview of the Paprika approach to detect software antipatterns in mobile apps.

Table 1: List of Paprika properties.Name Entities Commentsname All Name of the entityapp_key All Unique id of an applicationrating App Rating on the storedate_download App APK download datedate_analysis App Date of the analysispackage App Name of the main packagesize App APK size (MB)developer App Developer namecategory App Category in the storeprice App Price in the storenb_download App Number of downloads from the storeparent_name Class For inheritancemodifier Class public, protected or private

VariableMethod

type Variable Object type of the variablefull_name Method method_name#class_namereturn_type Method Return type of the methodposition Argument Position in the method signature

Table 2: List of Paprika relationships.Name Entities CommentsAPP_OWNS_CLASS App – ClassCLASS_OWNS_METHOD Class – MethodCLASS_OWNS_ATTRIBUTE Class – AttributeMETHOD_OWNS_ARGUMENT Method – ArgumentEXTENDS Class – ClassIMPLEMENTS Class – ClassCALLS Method – Method Method call graphUSES Method – Variable Variable access

RR n° 8693

Detecting Antipatterns in Android Apps 9

Table 3: List of Paprika MetricsName Type Entities CommentsNumberOfClasses OO AppNumberOfInterfaces OO AppNumberOfAbstractClasses OO AppNumberOfMethods OO ClassDepthOfInheritance OO Class Integer, minimum is 1.NumberOfImplementedInterfaces OO ClassNumberOfAttributes OO ClassNumberOfChildren OO ClassClassComplexity OO Class Sum of methods complexity, IntegerCouplingBetweenObjects OO Class Chidamber and Kemerer [18], IntegerLackofCohesionInMethods OO Class LCOM2 [18], IntegerIsAbstract OO Class, MethodIsFinal OO Class, Variable, MethodIsStatic OO Class, Variable, MethodIsInnerClass OO ClassIsInterface OO ClassNumberOfParameters OO MethodNumberOfDeclaredLocals OO Method Can be different from source codeNumberOfInstructions OO Method Related to number of lines in source codeNumberOfDirectCalls OO Method Numbers of calls made by the methodNumberOfCallers OO MethodCyclomaticComplexity OO Method McCabe [27], IntegerIsGetter OO Method Computed to bypass obfuscationIsSetter OO Method Computed to bypass obfuscationIsInit OO Method ConstructorIsSynchronized OO MethodNumberOfActivities Android AppNumberOfBroadcastReceivers Android AppNumberOfContentProviders Android AppNumberOfServices Android AppIsActivity Android ClassIsApplication Android ClassIsBroadcastReceiver Android ClassIsContentProvider Android ClassIsService Android Class

RR n° 8693

Detecting Antipatterns in Android Apps 10

with properties collected from the Google PlayStore. Then, Paprika proceeds with the ex-traction of metrics for each entity by explor-ing the Soot model. One should note that, inorder to optimize performance and to reduceexecution time, these steps are not executedsequentially, but rather executed in an oppor-tunistic way while visiting the Soot model.

Compared to traditional approaches for an-tipattern detection [38], using bytecode analy-sis instead of source code analysis raises sometechnical issues. For example, we cannot di-rectly access widely-used metrics, such as thenumber of lines of codes or the number of de-clared locals of a method. Therefore, we use ab-stract metrics, that are approximations of themissing ones, like the number of instructions toapproximate the number of lines of code.

Moreover, as evoked previously, many appli-cations available on Android markets are ob-fuscated to optimize size and make reverse-engineering harder. Most methods, attributes,and classes are therefore renamed with singleletters. Thus, we cannot rely on lexical data tocompute some quality metrics and we have toapply some bypass strategies. For instance, todetermine the presence of a getter or a setterwe are not observing the method names, butwe rather focus on the number and the typesof instructions as well as the variable accessedby the method.

4.3 Step 2: Converting PaprikaModel as a Graph Model

Input: A Paprika quality model with entity,properties and metrics.Output: A software quality graph modelstored in a database.Description: We aim at providing a scal-able solution to analyze mobile apps atlarge.Therefore, we use a graph database as aflexible yet efficient solution to store and querythe app model annotated with quality metricsextracted by Paprika.

Since this kind of databases are not depend-ing on a rigid schema, the Paprika model isalmost as it is described in the previous sec-tion. All Paprika entities are represented by

nodes, their attributes and metrics are prop-erties attached to these nodes. The relation-ships between entities are represented by one-way edges.Implementation: We selected the graphdatabase Neo4j [10] and we are using its Java-embedded version. We chose Neo4j because,when combined with the Cypher [9] querylanguage, it offers good performance on large-scale datasets, especially when embedded inJava [22]. Furthermore, Neo4j is also ableto contain a maximum of 235 nodes and rela-tionships, which match our scalability require-ments. Finally, Neo4j offers a straightforwardconversion from the Paprika quality metricsmodel to the graph database.

4.4 Step 3: Detecting Anti-patterns from Graph Queries

Input: A graph database containing a model ofthe applications to analyze and the antipatternsqueries.Output: Software antipatterns detected in theapplications.Description: Once the model loaded andindexed by the graph database, we use thedatabase query language to detect commonsoftware antipatterns. Entity nodes which im-plements antipatterns are returned as resultsfor all analyzed applications.Implementation: We use the Cypher querylanguage [9] to detect common software an-tipatterns as illustrated by listings 1 and 2.

All OO antipatterns are detected using athreshold to identify abnormally high valuefrom others commons values. To define suchthresholds, we collect all the values of a spe-cific metric and we identify outliers. We use aTukey Box plot [36] for this task. Figure 2 illus-trates this approach on the LCOM, number ofmethods and number of attributes, which areused in the request to detect Blob classes aspresented in Listing 1. All values superior tothe upper whisker are considered as very highwhereas all values inferior to the lower one arevery low. The upper border of the box repre-sents the first quartile (Q1) whereas the lowerborder is the third quartile (Q3), the distance

RR n° 8693

Detecting Antipatterns in Android Apps 11

between Q1 and Q3 is called the interquartilerange (IQR). The upper whisker value is givenby the formulaQ3+1.5×IQR, which is equal to12 for the number of methods in our example.It means that if the number of methods exceeds12, then it is considered as an outliers and canbe tagged as a class containing a high numberof methods. By combining the three thresholds,we are able to detect Blob classes. The usageof this statical method allows us to set thresh-olds that are specific to the input dataset, con-sequently results may vary depending on themobile apps included in the analysis process.Thus, the thresholds are representative of allapplications in the dataset and not only thecurrently analyzed application.

Currently, Paprika supports 8 antipatterns,including 4 Android-specific antipatterns:

Blob Class (BLOB) - OO A Blob class,also knows as God class, is a class with a largenumber of attributes and/or operations [16].The Blob class handles a lot of responsibilitiescompared to other classes. Attributes andmethods of this class are related to differentconcepts and processes, implying a very lowcohesion. Blob classes are also often associatedwith numerous data classes. Blob classesare hard to maintain and increase the diffi-culty to modify the software. In Paprika,classes are identified as Blob classes when-ever the metrics numbers_of_attributes,number_of_methods andlack_of_cohesion_in_methods are veryhigh. The Cypher query for this antipatternis described in Listing 1.

Listing 1: Cypher query to detect a Blob class.MATCH (cl:Class)WHERE

cl.lack_of_cohesion_in_methods > 15AND cl.number_of_methods > 12AND cl.number_of_attributes > 8

RETURN cl

Swiss Army Knife (SAK) - OO A Swissarmy knife is a class with numerous interfacesignatures, resulting in a very complex class in-terface designed to handle a wide diversity of

abstractions. This type of class is hard to un-derstand and to maintain because of the result-ing complexity [16]. A SAK is detected by Pa-prika when a class implements a large numberof interfaces.

Long Method (LM) - OO Long methodsare implemented with much more lines of codethan other methods. They are often very com-plex, and thus hard to understand and main-tain. These methods can be split into smallermethods to fix the problem [21]. Paprika iden-tifies a long method when the number of in-structions for one method is very high.

Complex Class (CC) - OO A complexclass is a class containing complex methods.Again, these classes are hard to understand andmaintain and need to be refactored [21]. Theclass complexity is calculated by summing theinternal methods complexities. The complexityof a method can be calculated using McCabe’sCyclomatic Complexity [27].

Internal Getter/Setter (IGS) - AndroidOn Android, fields should be accessed directlywithin a class to increase performance. Theusage of an internal getter or a setter con-verts into a virtual invoke, which makes theoperation three times slower than a direct ac-cess [1, 17]. Paprika is able to identify inter-nal getter and setter despite code obfuscation,and consequently identify such calls from themethod call graph. The Cypher query for thisAndroid-specific antipattern is reported in List-ing 2. This query matches two methods fromthe same class with the first one calling the sec-ond one, flagged as a getter or a setter.

Listing 2: Cypher query to detect Internal Get-ter/Setter.MATCH (m1: Method )-[:CALLS]->(m2: Method),

(cl:Class)WHERE

(m2.is_setter OR m2.is_getter)AND cl -[: CLASS_OWNS_METHOD]->m1AND cl -[: CLASS_OWNS_METHOD]->m2

RETURN m1

RR n° 8693

Detecting Antipatterns in Android Apps 12

Member Ignoring Method (MIM) - An-droid In Android, when a method does notaccess an object attribute, it is recommendedto use a static method. The static method in-vocations are about 15%–20% faster than a dy-namic invocation [1, 17]. Paprika can detectsuch methods since the access of an attributeby a method is extracted during the analysisphase. The Cypher query for this antipatternis described in Listing 3. This request explorenode properties to return non-static methodswhich are not constructor and relations to de-tect methods which are not using any classvariable nor calling other methods. Such de-tected methods could have been made static toincrease performance without any other conse-quences on the implementation.

Listing 3: Cypher query to detect Member Ig-noring Method.MATCH (m:Method)WHERE

NOT HAS(.‘is_static ‘)AND NOT HAS(m.is_init)AND NOT m-[: USES]->(: Variable)AND NOT (m)-[: CALLS]->(:Method)

RETURN m

No Low Memory Resolver (NLMR) -Android When the Android system is run-ning low on memory, the system calls themethod onLowMemory() of running activities,which are supposed to trim their memory us-age. If this method is not implemented bythe activity, the Android system kills the pro-cess in order to free memory, and can causean abnormal termination of programs [17]. Asoverridable methods of Android-specific classeslike Activity, Service, ContentProvider orBroadcast receivers are not concerned byobfuscation, consequently their presence can bechecked by Paprika.

Leaking Inner Class (LIC) - Android InJava, non-static inner and anonymous classesare holding a reference to the outer class,whereas static inner classes are not. Thiscould provoke a memory leak in Android sys-tems [17, 25]. Given that the Paprika model

LCOM number_of_methods number_of_attributes

05

1015

Figure 2: Box plots generated to define theBlob class detection thresholds.

contains all inner classes of the application anda metric to identify static class is attached toclasses, leaking inner classes are detected by adedicated query in Paprika.

5 Empirical Validation

This section describes the experimental proto-col we followed to assess Paprika as well asthe results we obtained from the analysis of 15mobile apps downloaded from the Google PlayStore.

5.1 Research Questions

The research questions we address along theseexperiments are:RQ1 : Can antipatterns originating from thesource code be detected at the bytecode level?Motivation: The compilation into bytecodeand the eventual code obfuscation imply a lossor a transformation of the information availablein the source. Java .jar files were already an-alyzed with success for OO antipatterns [24],but there is no evidence that the code was ob-fuscated and that a similar approach will beefficient when applied on Dalvik bytecode.RQ2 : To what extent mobile apps exhibit OOantipatterns?Motivation: As previously exposed, Android

RR n° 8693

Detecting Antipatterns in Android Apps 13

apps are different from non-mobile applica-tions. They are smaller and are relying moreheavily on external libraries and classes reuse.Moreover, the Android Framework could havean impact on developers pratice concerning an-tipatterns. We want to investigate if these dif-ferences have an impact on the presence of OOantipatterns.RQ3 : To what extent mobile apps exhibitAndroid-specific antipatterns?Motivation: There is a lack of studies con-cerning the presence of Android-specific an-tipatterns in mobile apps. We want to deter-mine if these antipatterns actually exists in thepublicly available apps and how frequent theyare. This knowledge can help practitioners toknow where to focus their effort during main-tenance tasks.RQ4 : Are OO and Android-specific antipat-terns present in the same proportion?Motivation: In this research question, we in-vestigate whether certain kind of antipatternsare more frequent than others. OO antipat-terns are well documented both in literatureand on Internet, but the resources are scarceconcerning Android antipatterns. We want toinvestigate if this difference is justified by a lackof knowledge due to the recency of the domainor because Android antipatterns are scarcer inmobile apps.

5.2 Subjects

We apply our Paprika approach to detect 8different antipatterns, presented in Section 4.4,they are all issued from the literature.

5.3 Objects

Witness application To validate and cali-brate our approach, we first developed a wit-ness mobile app implementing several instancesof each type of antipatterns. In particular, thisapplication implements 62 instances of the 8antipatterns we consider in this paper. Thesource code of the witness mobile app, availableon GitHub , represents 51 classes, including 6activities and 205 methods. Classes generatedduring the compilation, such as BuildConfig

and R, are included in this count.

Applications under analysis We also an-alyzed 15 free Android apps that are availablefrom the Google Play Store [12]. All these mo-bile apps and associated metadata were down-loaded in June 2014, and are summarized inTable 4. The reported size includes all the ap-plication resources, including images, data filesor third-party libraries, which the latter are notanalyzed by Paprika. We selected these appsbecause they were all ranked in the top 150apps of their categories. It should be notedthat, although this top is based on the num-ber of downloads, this number could be ratherhigh or low depending on the categories. Inorder to validate Paprika, we diversified theselected apps in terms of category, number ofdownloads, size and rating, but we also consid-ered some popular apps (e.g., Facebook, Skype,Twitter). The complexity of mobile apps there-fore varies from 3 classes (Free 5000 Movies) tomore than 9, 000 classes (Facebook).

5.4 Evaluation Protocol

We assessed our approach with a sensitivityanalysis, a comparison to other tools, and anevaluation of the impact of the obfuscation onour witness mobile app. Then, we analyzedall the mobile apps using Paprika on an In-tel Core i5-2450M and with 4GB RAM. Wefirst load all the mobile apps, which took be-tween 5 and 15 minutes per app, depending onsize and complexity. All the resulting qual-ity models were inserted in a single Neo4Jdatabase instance. Then, as described in Sec-tion 4.4, we compute the thresholds of querieswith Boxplot and we execute our 8 softwareantipatterns queries for each application usingCypher. Note that we excluded the witnessmobile app from the computation of thresh-olds, to ensure that the results are represen-tative from Google Play Store official applica-tions.

RR n° 8693

Detecting Antipatterns in Android Apps 14

Table 4: Description of mobile apps under analysisApp name Size (Mb) Downloads Rating(/5) Classes MethodsAdobe Reader 8.02 100,000,000+ 4.3 902 5,214Android Temperature 9.67 500,000+ 4 373 2,238Facebook 22.14 500,000,000+ 3.9 9,117 46,867Fitnet Apps 0.43 50+ 4.9 13 74FLV HD MP4 Video Player 12.11 1,000,000+ 4.3 364 2,332Free 5000 Movies 0.04 1,000,000+ 3 3 5Opera Mini browser for Android 0.89 100,000,000+ 4.4 182 1,830Savoir Maigrir avec J-M Cohen 1.87 50,000+ 2.9 449 2,190Simulator Laser 4.92 5,000,000+ 2.1 225 1,205Skype - free IM & video calls 17.55 100,000,000+ 4.1 2,364 12,901Superbuzzer Trivia Quiz Game 2.80 500,000+ 3.9 858 5,265Tcheck’it - Gagnez de l’argent 2.92 5,000+ 2.2 1,306 9,268Twitter 9.94 100,000,000+ 4.1 4,335 29,309Video Chat Rooms - Chat.Org 4.34 100,000+ 4.1 7 94Zoom Camera Free 0.67 5,000,000+ 4 415 2,131

5.5 Preliminary AssessmentsSensitivity analysis We performed a sensi-tivity analysis for the thresholds we computewith median, mean, Q3 and Q3 + 1.5 × IQRused as values for the detection of Blob, SAK,LM and CC. As one can observe in Table 5, theusage of Q3+ 1.5× IQR offers the best resultsfor threshold-based queries.

Table 5: Sensitivity analysis for the thresholdThreshold Precision Recall F1Median 0.348 1 0.5161Mean 0.8 1 0.8889Q3 0.7619 1 0.865Q3+1.5*IQR 1 1 1

Globally, we expected the presence of 62 an-tipatterns from the source code and discovered62 of them in the witness mobile app by ana-lyzing the APK with Paprika, which providesa precision of 1, a recall of 1 and a F1 value of1 for all the antipattern queries.

Comparison with other tools We com-pared our approach with tools dedicated tothe detection of code smells and OO antipat-terns form the source code of Java applications.PMD is a popular tool for Java applications, itanalyzes code to detect programming flaws de-

fined by rules. It currently contains 347 rules,including ExcessiveMethodLength which is sim-ilar to the long method antipattern. These rulesare mostly define with static thresholds, whichcan be adjusted in the preferences of PMD.By default, a method is flagged by the Exces-siveMethodLength rule if it contains more than100 lines. However, this value sounds arbitraryand inappropriate for mobile apps were 75%of methods have less than 15 instructions asshowed by our statistical analysis performed onour 15 applications. In particular, only 3 in-stances are detected in our witness mobile appwith this predefined threshold. Other valuescan be used as threshold, but PMD does notprovide any mechanism to help in setting thisvalue.

To address this problem, approaches likeJDeodorant or Ptidej use statistical anal-ysis on the current application to determinethresholds. In this way, JDeodorant is ableto detect one instance of Blob class using theaverage ratio of cohesion over coupling for allclasses [20]. Using the Paprika approach, wedetect two Blob classes. However, when remov-ing most of the classes to keep only these twoBlob classes, JDeodorant is not able anymoreto detect any Blob class even if these classeshave not been modified. By computing thethresholds from a crowd of mobile apps instead

RR n° 8693

Detecting Antipatterns in Android Apps 15

of a single instance, Paprika provides a robustapproach to detect antipatterns.

Impact of obfuscation on antipatterns de-tection To evaluate the impact of code ob-fuscation on our results, we performed the samedetection on our witness mobile app obfuscatedusing the popular Proguard 5.1 and StringerJava Obfuscator 1.6.11. Proguard is an open-source Java obfuscator compatible with An-droid, which is embedded in many develop-ment environments like Netbeans, EclipseMEor Google’s Android Studio. Stringer JavaObfuscator is a commercial tool that providesstring encryption and is compatible with mostIDE. Both tools support classes, methods andvariables renaming as well as code optimiza-tion. As reported in Proguard FAQ, these toolsare not performing any flow obfuscation in or-der to avoid negative effects on performanceand size. However, the optimization step canrestructure the code, including dead code re-moval. As we can observe in Table 6, Stringerhas no impact on the detection, while Proguardslightly affects the recall of our detection. Alldetected instances are true positive as shownby the precision.

The impact of Proguard on recall can be ex-plained by two factors. First, the Proguard op-timization process removes an instance of longmethod while optimizing resources concerningapplication style in the Android R class. Then,the obfuscation process prevents us from de-tecting inner classes in the bytecode, hence allleaking inner classes are not detected anymore.The impact of obfuscation is therefore limited,especially since we were able to detect innerclasses in all mobile apps studied in this paper.

5.6 Analysis Results

Table 7 summarizes the results we obtained forthe detection of 8 antipatterns from our inputdataset composed of 15 Android official appsand a witness app. The detailed results andstatistics for all mobile apps are available on-line. The antipatterns are grouped by the typeof entities concerned, either classes or methodsor entities. The integer value represents the

number of occurrences of the antipatterns inthe mobile app. The percentage is the ratioof this value with regard to the total number ofclasses, methods or activities in the mobile app.Each antipattern is attached to one instance ofthe entity type, but it should be noted that asame entity can be affected by more than oneantipattern. For example, a class can be a com-plex class and a blob class simultaneously. Wefurther exploited these results to answer our re-search questions.

Table 7: Paprika results for the detection of 8antipatterns in 15 applications.

ClassBLOB SAK CC LIC(OO) (OO) (OO) (Android)

TotalRatio

711 157 2,367 7,5093.55% 0.78% 11.83% 37.52%

Method ActivityLM IGS MIM NLMR(OO) (Android) (Android) (Android)

TotalRatio

12,592 503 22,997 12210.41% 0.42% 19.02% 39.23%

5.6.1 RQ1 Can antipatterns originatingfrom the source code be detectedat the bytecode level?

The results on our witness mobile app alreadyproved that we were able to detect antipatternsfrom bytecode. This observation can be ac-knowledged with the results on the 15 applica-tions. We computed the thresholds to be rep-resentative for all these applications. Antipat-terns were detected in all of them regardless oftheir specificities and despite code obfuscation.

5.6.2 RQ2 To what extent mobile appsexhibit OO antipatterns?

LM (10.41 % of methods) and CC (11.83% ofclasses) are frequent and common in all mo-bile apps with the same order of magnitude.Around 3% of all classes are Blobs. Our obser-vation also shows that almost one third of allactivities are considered as Blobs. Our hypoth-esis is that the design of the Android Frame-work tends to encourage the presence of "blob

RR n° 8693

Detecting Antipatterns in Android Apps 16

Table 6: Impact of obfuscation on antipatterns detectionTool Precision Recall F1Stringer obfuscation 1 1 1Stringer obfuscation and optimization 1 1 1Proguard optimization 1 0.9838 0.9918Proguard optimization and obfuscation 1 0.9032 0.9491

activities". This practice was previously identi-fied in one mobile app [28]. Our results seems toconfirm that this is a frequent antipattern, butfurther investigation are needed to confirm thisassumption on a larger dataset. Furthermore,the proportions for these three antipatterns aresimilar to the one observed in non-mobile ap-plications by the DECOR approach [29]. How-ever, we detected less than 1% SAK, whereaswith the DECOR approach it is around 2.9%.The higher value of SAK is in Facebook (1.15%of classes), which is an application making alarge usage of interfaces as we discovered whenanalyzing our results. In conclusion, OO an-tipatterns are common in Android apps andthey are as frequent as in non-mobile applica-tions, except for SAK.

5.6.3 RQ3 To what extent mobile appsexhibit Android-specific antipat-terns?

The NLMR is the most frequent antipattern inproportion as it appears in all the applications(39.23% of all activities) except in the singleactivity of Video Chat Rooms. This high pres-ence can be explained by the fact that develop-ers are not aware of how Android manages thememory and that the method onLowMemory()exists. Another reason is that small mobileapps with less than 100 classes probably do nothave any cache or resources to free. Twitter,Facebook, and Skype are interesting cases be-cause they have very few NLMR compared tothe numbers of activities (around 15%). Thisshows that the memory management is consid-ered when these applications are developed andtherefore they are presenting a better qualityfor memory management.

In proportion, the LIC is the second mostfrequent antipattern (37.52% of all classes) and

it appears in all mobile apps. Even if the ratiois really high, we are not surprised by this ob-servation. The usage of inner and anonymousclasses is common in Java and Android appli-cations. Our assumption is that mobile devel-opers do not consider the performance impli-cations of inner classes, even in systems wherememory leaks are more problematic due to thelimited resources. Another explanation is thatthey cannot use static classes or weak referencesdue to implementations constraints. An innerclass may need to access the attributes of itsouter class.

The MIM is another frequent antipattern(19.02 % of all methods) for all the analyzedmobile apps. Our hypothesis is that developersare using static methods only when needed bythe implementation and not for efficiency pur-poses.

These antipatterns are detected by exploringthe model, thus their detection is not impactedby thresholds. Consequently, this values areindependent of the analyzed dataset and theirpresence is acknowledged. They are known toaffect the efficiency of mobile apps and thus,a refactoring focusing on the concerned classesand methods can improve the mobile app per-formance without any trade-off.

5.6.4 RQ4 Are OO and Android-specificantipatterns present in the sameproportion?

Our results shows that Android-specific an-tipatterns are more frequent than OO ones. Itis interesting to notice that the MIM, NLMRand LIC are all antipatterns, which have beenrecently defined for Android and thus they arenot yet really well referenced by the literatureand developer’s documentations, that may be acause of their strong presence. Also, one should

RR n° 8693

Detecting Antipatterns in Android Apps 17

note that the results on those antipatterns maygreatly vary between applications, for examplearound 93% of Adobe Reader activities do notimplement a method onLowMemory() whereasit is only around 15% for Twitter. Therefore,we assume that the developers practices are theroot cause to their presence.

5.6.5 Other observations

One can observe that Fitnet Apps and VideoChat Rooms - Chat.Org are the applicationswith the biggest proportion of antipatterns. Itis interesting to note that they are very smallmobile apps when looking at the numbers ofclasses and methods. We are observing thatvery small apps tend to have a higher propor-tion of antipatterns, however our sample is toosmall to validate this assumption statistically.Moreover, this correlation between the mobileapp complexity and the ratio of antipatternsdoes not seem to hold when comparing mediumand big applications. In order to confirm theassumptions we have made and to look for cor-relations between the applications characteris-tics, we are planning to conduct the thoroughstudy on a bigger sample containing thousandsof mobile apps.

5.7 Threats to validityIn this section, we discuss the threats to validityof our study based on the guidelines providedby Wohlin et al. [39].Construct validity threats concern the re-

lation between theory and observations. Inthis study, these threats could be due to er-rors during the analysis process. We validateour approach on a witness mobile app devel-oped to explicitly include 8 antipatterns and wechecked manually that the stored models, met-rics and antipatterns detected were correspond-ing to the source code even when obfuscationand optimization were used.Internal validity concerns our selection of

subject systems and analysis methods. Theusage of Boxplot to define thresholds couldbe criticized since it depends on the analyzeddataset and could have lead to other conclu-sions with other applications. However, only

half of the software antipatterns queries de-pends on these thresholds. Moreover, the usageof all the dataset to determine the thresholds isa guarantee that the thresholds are more rep-resentative than if they were arbitrary definedor calculated with only one application.External validity threats concern the possi-

bility to generalize our findings. We tried tocollect a wide diversity of mobile apps, andwe are thus thinking that they are significantand representative of what is distributed by theGoogle Play Store and installed on Android sys-tems. However, future studies should considerlarger sets of official Android apps. Moreover,the approach and results reported in this paperare specific to Android and cannot be general-ized to other systems, such as iOS or WindowsPhone.Reliability validity threats concern the pos-

sibility of replicating this study. We attemptto provide all the necessary details to replicateour experiments. In addition to the results pre-sented in this paper, the source code of ourmobile app as well as our graph databases areavailable online.

Finally, the conclusion validity threats referto the relation between the treatment and theoutcome. We paid attention not to make con-clusions that cannot be validated with the pre-sented results.

6 Conclusion and futurework

In this paper, we presented Paprika, an inno-vative tooled approach to detect antipatternsdirectly from Android packages. Paprika doesnot require the source code to perform thisdetection, and works even when the code hasbeen obfuscated to prevent reverse-engineering.Thus, Paprika is allowing the analysis of anyapplication available from app stores.

Our approach was applied on a witness appli-cation to validate its efficiency on the detectionof 4 OO antipatterns and 4 Android-specific an-tipatterns. This validation confirmed that Pa-prika is able to detect such antipatterns. Wealso performed a study on 15 popular applica-

RR n° 8693

Detecting Antipatterns in Android Apps 18

tions downloaded on the Google Play Store andfound significant results. We discovered thatantipatterns implemented in the source codecan be detected at the bytecode level (RQ1).Both types of antipatterns were found in allanalyzed applications. Results also show thatOO antipatterns are as present in Android ap-plications as in non-mobile applications, withthe exception of the Swiss army knife antipat-tern (RQ2). We also discovered that Android-specific antipatterns are highly frequent andcommon in all applications (RQ3), indeed theyare even more frequent that OO antipatterns(RQ4).

Concerning future work, we plan to imple-ment the detection of more antipatterns and toperform our analysis on a larger set of appli-cations. In this purpose, we already collectedthousands of applications from the Google PlayStore. We hope that this large-scale study willvalidate the tendencies we observed in this pa-per, but also that we will observe a correla-tion between the frequency of antipatterns andtheir metadata like the rating or the category.We will also study the evolution of antipat-terns between different versions of an applica-tion. Finally, we are planning to use our largedatabase of knowledge to discover new antipat-terns, which are existing in the wild but havenot been defined in the literature, yet.

Acknowledgements

These researches are co-funded by Universitéof Lille, Université du Québec à Montréal, In-ria, The Natural Sciences and Engineering Re-search Council of Canada (NSERC) and Fondsde recherche du Québec - Nature et technologies(FQNRT).

References

[1] Android performance tips. http://developer.android.com/training/articles/perf-tips.html. [Online;accessed November-2014].

[2] Android studio. https://developer.android.com/sdk/installing/studio.html. [Online; accessed November-2014].

[3] Android will account for 58com-manding a market share of 75https://www.abiresearch.com/press/android-will-account-for-58-of-smartphone-app-down.[Online; accessed November-2014].

[4] Dalvik bytecode. https://source.android.com/devices/tech/dalvik/dalvik-bytecode.html. [Online; ac-cessed November-2014].

[5] Java platform, micro edition (java me).http://www.oracle.com/technetwork/java/embedded/javame/index.html.[Online; accessed November-2014].

[6] Mobile applications futures 2013-2017. http://www.portioresearch.com/en/mobile-industry-reports/mobile-industry-research-reports/mobile-applications-futures-2013-2017.aspx. [Online; accessed November-2014].

[7] Pmd. http://pmd.sourceforge.net/.[Online; accessed November-2014].

[8] Smali: An assembler/disassembler forandroid’s dex format. https://code.google.com/p/smali. [Online; accessedNovember-2014].

[9] Cypher. http://neo4j.com/developer/cypher-query-language.[Online; accessed November-2014].

[10] Neo4j. http://neo4j.com. [Online; ac-cessed November-2014].

[11] Tools to work with android .dex and java.class files. https://code.google.com/p/dex2jar. [Online; accessed November-2014].

[12] Google play store. https://play.google.com, 2014. [Online; accessedNovember-2014].

RR n° 8693

Detecting Antipatterns in Android Apps 19

[13] K. Aggarwal, Y. Singh, A. Kaur, andR. Malhotra. Empirical analysis for inves-tigating the effect of object-oriented met-rics on fault proneness: a replicated casestudy. Software process: Improvement andpractice, 14(1):39–62, 2009.

[14] A. Bartel, J. Klein, Y. Le Traon, andM. Monperrus. Dexpler: converting an-droid dalvik bytecode to jimple for staticanalysis with soot. In Proc. of theACM SIGPLAN International Workshopon State of the Art in Java Program anal-ysis, pages 27–38. ACM, 2012.

[15] V. R. Basili, L. C. Briand, and W. L.Melo. A validation of object-oriented de-sign metrics as quality indicators. IEEETransactions on Software Engineering,22(10):751–761, 1996.

[16] W. J. Brown, H. W. McCormick, T. J.Mowbray, and R. C. Malveau. AntiPat-terns: refactoring software, architectures,and projects in crisis. Wiley New York, 1.auflage edition, 1998.

[17] M. Brylski. Android smells cata-logue. http://www.modelrefactoring.org/smell_catalog, 2013. [Online; ac-cessed November-2014].

[18] S. R. Chidamber and C. F. Kemerer. Ametrics suite for object oriented design.IEEE Transactions on Software Engineer-ing, 20(6):476–493, 1994.

[19] W. Enck, D. Octeau, P. McDaniel, andS. Chaudhuri. A study of android appli-cation security. In USENIX security sym-posium, volume 2, page 2, 2011.

[20] M. Fokaefs, N. Tsantalis, E. Stroulia, andA. Chatzigeorgiou. Jdeodorant: identifica-tion and application of extract class refac-torings. In Proceedings of the 33rd Interna-tional Conference on Software Engineer-ing, pages 1037–1039. ACM, 2011.

[21] M. Fowler. Refactoring: improving the de-sign of existing code. Pearson EducationIndia, 1999.

[22] F. Holzschuher and R. Peinl. Performanceof graph query languages: comparison ofcypher, gremlin and native access in neo4j.In Proc. of the Joint EDBT/ICDT 2013Workshops, pages 195–204. ACM, 2013.

[23] W. Li and R. Shatnawi. An empiri-cal study of the bad smells and class er-ror probability in the post-release object-oriented system evolution. Journal of sys-tems and software, 80(7):1120–1128, 2007.

[24] M. Linares-Vásquez, S. Klock, C. McMil-lan, A. Sabané, D. Poshyvanyk, and Y.-G. Guéhéneuc. Domain matters: bring-ing further evidence of the relationshipsamong anti-patterns, application domains,and quality-related metrics in java mobileapps. In Proc. of the 22nd InternationalConference on Program Comprehension,pages 232–243. ACM, 2014.

[25] A. Lockwood. How to leak a context:Handlers and inner classes. http://www.androiddesignpatterns.com/2013/01/inner-class-handler-memory-leak.html, 2013. [Online; accessed November-2014].

[26] C. Marinescu, R. Marinescu, P. F. Mi-hancea, and R. Wettel. iplasma: An inte-grated platform for quality assessment ofobject-oriented design. In In ICSM (In-dustrial and Tool Volume). Citeseer, 2005.

[27] T. J. McCabe. A complexity measure.IEEE Transactions on Software Engineer-ing, (4):308–320, 1976.

[28] R. Minelli and M. Lanza. Software ana-lytics for mobile applications–insights andlessons learned. In 17th European Confer-ence on Software Maintenance and Reengi-neering (CSMR), pages 144–153. IEEE,2013.

[29] N. Moha, Y.-G. Guéhéneuc, L. Duchien,and A. Le Meur. Decor: A method forthe specification and detection of code anddesign smells. Software Engineering, IEEETransactions on, 36(1):20–36, 2010.

RR n° 8693

Detecting Antipatterns in Android Apps 20

[30] J. Reimann, M. Brylski, and U. Aß-mann. A Tool-Supported Quality SmellCatalogue For Android Developers. InProc. of the conference Modellierung2014 in the Workshop Modellbasierte undmodellgetriebene Softwaremodernisierung– MMSM 2014, 2014.

[31] J. Reimann, M. Seifert, and U. Aßmann.On the reuse and recommendation ofmodel refactoring specifications. Software& Systems Modeling, 12(3):579–596, 2013.

[32] I. J. M. Ruiz, M. Nagappan, B. Adams,and A. E. Hassan. Understanding reusein the android market. In 20th Interna-tional Conference on Program Comprehen-sion (ICPC), pages 113–122. IEEE, 2012.

[33] M. Schönefeld. Reconstructing dalvik ap-plications. In 10th annual CanSecWestconference, 2009.

[34] Y. Singh, A. Kaur, and R. Malhotra. Em-pirical validation of object-oriented met-rics for predicting fault proneness models.Software quality journal, 18(1):3–35, 2010.

[35] N. Tsantalis, T. Chaikalis, and A. Chatzi-georgiou. Jdeodorant: Identification andremoval of type-checking bad smells. InSoftware Maintenance and Reengineering,2008. CSMR 2008. 12th European Confer-ence on, pages 329–331. IEEE, 2008.

[36] J. W. Tukey. Exploratory Data Analysis.Addison-Wesley, 1977.

[37] R. Vallée-Rai, P. Co, E. Gagnon, L. Hen-dren, P. Lam, and V. Sundaresan. Soot-a java bytecode optimization framework.In Proc. of the conference of the Centrefor Advanced Studies on Collaborative re-search, page 13. IBM Press, 1999.

[38] D. Verloop. Code Smells in the Mobile Ap-plications Domain. PhD thesis, TU Delft,Delft University of Technology, 2013.

[39] C. Wohlin, P. Runeson, M. Höst, M. C.Ohlsson, B. Regnell, and A. Wesslén.Experimentation in software engineering.Springer, 2012.

[40] L. Xu. Techniques and Tools for Analyzingand Understanding Android Applications.PhD thesis, University of California Davis,2013.

RR n° 8693

RESEARCH CENTRELILLE – NORD EUROPE

Parc scientifique de la Haute-Borne40 avenue Halley - Bât A - Park Plaza59650 Villeneuve d’Ascq

PublisherInriaDomaine de Voluceau - RocquencourtBP 105 - 78153 Le Chesnay Cedexinria.fr

ISSN 0249-6399