
FrauDroid: An Accurate and Scalable Approach to Automated Mobile Ad Fraud Detection

Feng Dong∗†, Haoyu Wang∗‡, Yuanchun Li§, Yao Guo§, Li Li¶, Shaodong Zhang∗†, Guoai Xu∗†

∗School of Cyberspace Security, Beijing University of Posts and Telecommunications, Beijing, China

†NEL of Security Technology for Mobile Internet, Beijing University of Posts and Telecommunications, Beijing, China

‡Beijing Key Laboratory of Intelligent Telecommunication Software and Multimedia, Beijing, China

§Key Lab of High-Confidence Software Technologies (MOE), Peking University, Beijing, China

¶University of Luxembourg, Luxembourg City, Luxembourg

{dongfeng,haoyuwang,zhangsd,xga}@bupt.edu.cn, {liyuanchun,yaoguo}@pku.edu.cn, [email protected]

Abstract—Previous studies on mobile ad fraud detection are limited to detecting certain types of ad fraud (e.g., placement fraud and click fraud). Dynamic interactive ad fraud has never been explored by previous work. In this paper, we present an explorative study of ad fraud in Android apps. We first created a taxonomy of existing mobile ad fraud. Besides static placement fraud, we also created a new category called dynamic interactive fraud and found three new types of it. We then propose FrauDroid, a new approach to detect ad fraud based on the user interface (UI) state transition graph and networking traffic. To achieve accuracy and scalability, we use an optimized exploration strategy to traverse the UI states that contain ad views, and design a set of heuristic rules to identify different kinds of ad fraud. Experimental results show that FrauDroid achieves a precision of 80.9% and is capable of detecting all 7 types of mobile ad fraud. We then apply FrauDroid to more than 80,000 apps from 21 app markets to study the prevalence of ad fraud. We find that 0.29% of the apps that use ad libraries contain fraud, with 23 ad networks suffering from ad fraud. Our findings also suggest that dynamic interactive fraud is more prevalent than static placement fraud in real-world apps.

Index Terms—Android, ad fraud, automation, user interface, mobile app

I. INTRODUCTION

Mobile advertising is the main way for most app publishers to earn profit. Advertisement libraries are used in almost two-thirds of the popular apps on Google Play [1]. Mobile advertising has become the most important support for the Android ecosystem [2].

App developers get revenue from advertisers based on the number of advertisements (ads for short) displayed (called impressions) [3] or clicked by users (called clicks). However, current mobile advertising is plagued by various types of fraud [4] [5]. Malicious developers can cheat advertisers and users with fake or unintentional ad impressions/clicks. For example, mobile publishers may employ individuals or use bot networks to drive fake impressions and clicks to earn profit [6]. It is estimated that mobile advertisers lost up to 1.3 billion US dollars to ad fraud in 2015 [7]. Ad fraud in mobile advertising has become the next battleground for advertisers and researchers.

Ad fraud in web advertising has been extensively studied. Most studies focus on click fraud (also known as bot-driven fraud), which uses individuals or bot networks to drive fake or undesirable impressions and clicks. These studies include detecting click fraud based on networking traffic [8] [9] or search engine query logs [10], characterizing click fraud [11] [12] [13], and analyzing its profit model [14].

Very few studies have explored ad fraud in mobile apps. One relevant work, DECAF [4], was proposed to detect placement fraud, which manipulates the visual layouts of ad views (also called elements or controls) to trigger undesirable impressions in Windows Phone apps. It explored the UI states (the running pages of an app) to detect ad placement fraud, e.g., hidden ads, many ads in one page, etc. MAdFraud [6] was proposed to detect click fraud in Android apps by analyzing networking traffic.

However, with the evolution of ad fraud, the deception techniques used by mobile publishers are becoming more and more sophisticated, especially when the characteristics of mobile apps are considered. Besides the placement fraud and click fraud mentioned earlier, many apps use sophisticated ways to entice users to click ad views during their interaction with the app, which the users do not expect. We call these dynamic interactive fraud. Figure 1 shows a motivating example. The app (com.android.yatree.taijiaomusic) unexpectedly pops up an ad view over the exit button when the user wants to exit the app, which can cause an undesirable click. In our study, this app misled 9 out of 10 users into clicking the ad view.

In line with previous studies, we divide mobile ad fraud into two categories:

1) Static placement fraud. The fraud occurs in a single UI state, and the behavior can be determined from static information such as the ad view size, location, and the number of ad views.

2) Dynamic interactive fraud. The fraud involves multiple UI states and occurs during the user's interaction with the app at runtime.

In this paper, we present an explorative study of ad fraud in Android apps. We first created a taxonomy of existing mobile

arXiv:1709.01213v2 [cs.CR] 6 Sep 2017


Fig. 1. An example of explicit UI transition fraud.

ad fraud. Besides static placement fraud (hidden fraud, size fraud, number fraud, overlap fraud), we also created a new category called dynamic interactive fraud that has never been explored in previous work. These frauds relate to how apps mislead users into triggering ad impressions during their interaction with the app at runtime. We have identified three new types of dynamic interactive fraud: Explicit UI Transition Fraud, which intentionally pops up an ad view during the user's interaction with the app to get accidental clicks; Implicit UI Transition Fraud, which downloads apps or files without the user's interaction after an ad is clicked; and Pop-up Window Fraud, which pops up an ad view outside of the app environment. Note that click fraud on Android has been well studied in previous work [6] [15], so it is not in the scope of this taxonomy: click fraud is mainly about fake impressions/clicks, while in this paper we focus on fraud that entices users to click. The state of the art has only explored the straightforward frauds (e.g., placement fraud, click fraud); in this work we go one step further to tackle the more challenging ones, that is, to pinpoint the frauds that entice users to click.

Unlike static placement fraud detection, which only needs to analyze the visual layout of ads in a single state [4], we face several new challenges in detecting these new kinds of ad fraud:

1) UI state transition. A UI state is a running page that contains many visual views (also called elements), which refer to controls in Android. To detect dynamic interactive fraud, we must consider not only the current UI state but also the relationships between UI states, and even analyze the background networking traffic. To detect the ad fraud in the motivating example (Figure 1), we must analyze both the current UI state and the next UI state to check whether an ad view is placed over the button, enticing users to click unexpectedly. One challenge here is how to trigger ad views, and how to achieve scalability and state coverage when traversing UI states. Previous work suggested that it takes several hours to traverse all the states of an app with an Android automation framework [16].

2) Ad view detection. Unlike on the Windows Phone platform that previous work [4] focused on, it is hard to identify an ad view in a given UI state, because no labels are provided to distinguish it from other views. During app development, views can be added to an activity either by specifying them in the XML layout or by embedding them in the source code. Most ad views are embedded in the code, which makes it infeasible to identify ad views based on the XML layout. How do we distinguish an ad view from the large number of normal views in a given app state?

3) Detecting multiple kinds of ad fraud. We must explore the characteristics of different ad frauds, and the Fraud Checker we design must capture enough features to identify different ad fraud behaviors, especially those that have never been studied by previous work. What kinds of features should we collect, and how do we collect them?

To address these challenges, we propose FrauDroid, a new approach to detect ad fraud based on UI transition graphs and networking traffic. We introduce two key techniques to achieve accuracy and scalability in ad fraud detection.

1) We propose an automation technique to traverse the UI states that contain ad views in Android apps. To capture the relationships between UI states, we simulate user interaction events, build the UI transition graph, and monitor networking traffic at runtime. To distinguish ad views from the large number of normal views in a given app state, we collected a large number of ad views and normal views, compared them from various aspects (e.g., placement features, string features, etc.), and trained a detector to identify ad views. To achieve scalability in UI state traversal, we optimize the UI exploration strategy based on the findings that more than 80% of UI states do not contain ad views, and that ad views usually appear at the start and end of a UI state.

2) We design a set of heuristic rules based on the characteristics of different ad frauds. We capture various properties (e.g., size, bounds, string, networking traffic, etc.) of ad views at runtime to check against these rules. For the motivating example (Figure 1), the Fraud Checker first identifies the state that contains an ad view, then traverses the state transition graph. If an ad view click event or an ad close button click event is found, and the next state is an interactive view such as an exit dialog, it labels the previous state as interactive fraud.
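The rule described above can be sketched as a simple check over a state transition. All field names (`target`, `is_ad`, `actionable`, `bounds`) and the overlap test below are illustrative assumptions for this sketch, not FrauDroid's actual implementation.

```python
def rects_overlap(a, b):
    """Axis-aligned overlap test on (left, top, right, bottom) bounds."""
    return not (a[2] <= b[0] or b[2] <= a[0] or a[3] <= b[1] or b[3] <= a[1])

def check_explicit_transition_fraud(prev_state, event, next_state):
    """Flag the previous state if an ad-view (or ad-close-button) click leads
    to an interactive state where an ad view overlaps an actionable view,
    e.g. an ad placed over an exit-dialog button."""
    if event["target"] not in ("ad_view", "ad_close_button"):
        return False
    ad_views = [v for v in next_state["views"] if v.get("is_ad")]
    buttons = [v for v in next_state["views"] if v.get("actionable")]
    return any(rects_overlap(ad["bounds"], btn["bounds"])
               for ad in ad_views for btn in buttons)
```

A real checker would walk every edge of the recorded transition graph and apply this predicate to each (state, event, state) triple.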

We have implemented a prototype system to detect ad fraud in Android apps. Experiments show that FrauDroid achieves a precision of 80.9% and is capable of detecting all 7 types of Android ad fraud. We have applied FrauDroid to a large dataset containing 80,539 apps crawled from 21 app markets, including Google Play and other major third-party markets. Based on extensive experiments on smartphones, we have identified 92 ad fraud apps, roughly accounting for 0.3% of the apps that use ad libraries in our study. For some


Fig. 2. An overview of the mobile advertising ecosystem.

app markets such as PP Assistant1, more than 2.65% of the apps that use ad libraries are detected as containing ad fraud. We also find 5 ad fraud apps on Google Play, which demonstrates that mobile ad fraud is prevalent in app markets.

We make the following main research contributions:

1) We create a taxonomy of existing mobile ad frauds. Besides placement fraud (hidden fraud, size fraud, number fraud, overlap fraud), we also identified a new category called dynamic interactive fraud (explicit UI transition fraud, implicit UI transition fraud, pop-up window fraud) that has never been explored by previous studies.

2) We propose FrauDroid, a new approach to detect mobile ad fraud based on the UI transition graph and networking traffic. To the best of our knowledge, FrauDroid is the first work that can detect all 7 kinds of ad fraud in Android apps.

3) We have applied FrauDroid to 80,000 apps from 21 app markets. Experimental results show that FrauDroid is able to detect ad fraud with acceptable accuracy and scalability. We found 92 ad fraud apps, some of which are popular apps with millions of downloads.

II. BACKGROUND

A. Mobile Advertising

We first give a brief introduction to the mobile advertising ecosystem. There are four key roles involved: the advertiser, the advertising network, the app publisher, and the mobile user. The advertiser advertises on the advertising network and pays for it. The advertising network (ad network) is a trusted intermediary platform that connects advertisers and app publishers. The app publisher (also called the developer) releases apps to the app markets. The mobile user downloads apps from the app markets and uses them on a smartphone. Figure 2 shows the workflow of the mobile advertising ecosystem. On mobile platforms, most ads are embedded in apps in the form of an ad library that fetches and displays ads at runtime. Advertisers put an ad on the ad network, then app publishers get and embed the ad

1 https://www.25pp.com/android/

library from the ad network. Users download the app and run it on their smartphones. The ad library fetches ad content and sends feedback (impressions and clicks) to the ad network after the app starts. Advertisers pay ad networks and app publishers based on impression or click counts.

However, ad fraud is threatening the mobile advertising ecosystem. Some ad networks are aware of the threat and have published policies and guidelines (called prohibitions) on how ad views should be placed or used in apps. For example, Google AdMob publishes AdMob program policies that all app publishers are required to follow [17]. All violations of these policies are regarded as ad fraud. Despite these prohibitions, mobile app publishers have incentives to commit such fraud. Ad fraud endangers the entire mobile advertising ecosystem: it cheats mobile advertisers, reduces the credibility of ad networks, and even greatly degrades the user experience.

B. Mobile Ad Fraud

Ad fraud in web advertising has been extensively studied, but there is very little research on mobile ad fraud.

Previous studies mainly focus on two categories of mobile ad fraud: click fraud and static placement fraud. MAdFraud [6] and ClickDroid [15] were proposed to detect click fraud in Android apps based on networking traffic analysis. MAdFraud analyzed ad request pages to identify publishers that request ads while the app is in the background or that click ads without the user's interaction. ClickDroid empirically evaluated the click fraud security risks of eight popular ad networks. DECAF [4] was proposed to automatically detect placement fraud on Windows Phone by analyzing the visual layouts of ad views, but it cannot be applied to Android apps directly. Placement fraud detection on Android remains unstudied, let alone dynamic interactive fraud. To the best of our knowledge, this paper is the first work to study both placement fraud and dynamic interactive fraud on the Android platform.

In this paper, we created a taxonomy of mobile ad fraud based on the behavior policies of popular ad libraries [17] [18] and a manually labeled dataset of ad fraud apps. Besides static placement fraud, we also introduce a new category called dynamic interactive fraud.

1) Explicit UI Transition Fraud: Popping up an ad view during the user's interaction with the app, which can lead to undesirable clicks. App publishers use this trick to get more accidental clicks. The policies of Google AdMob require that publishers "Do not place interstitial ads on app load and when exiting apps as interstitials should only be placed in between pages of app content" [17].

2) Implicit UI Transition Fraud: Different from click fraud, which uses bot networks to drive fake clicks, implicit UI transition fraud triggers stealthy downloading behaviors (apps or files) without the user's consent after the user clicks the ad. The criteria for mobile ads stipulate that "The behavior that directly download apps or files without user's interaction after clicking the ad is a


Fig. 3. Overview of FrauDroid.

mandatory download" [19]. Besides, most of the downloading behaviors in implicit UI transition fraud cannot be canceled. The downloading behavior costs users money and greatly degrades the user experience.

3) Pop-up Window Fraud: Popping up an ad view after exiting the app. The behavior policies of Google AdMob state that ads should not be placed in apps that are running in the background or outside of the app environment [17]. Pop-up window fraud can seriously degrade the user experience, because the pop-up ad annoys mobile users in many ways. For example, it stays on the home screen and covers app icons the user wants to click. It may show up unscheduled, and the user cannot dismiss it without first finding out which app the ad belongs to.

4) Hidden Fraud: Ads are hidden behind other controls. The behavior policies of Google AdMob state that ads should not be placed very close to or underneath buttons or any other object that users may accidentally click while interacting with the app [17]. App publishers hide ads to give users the illusion of an "ad-free app" and thus a better experience. But ads in hidden fraud do not play the promotional role that normal ads do, resulting in losses for advertisers.

5) Size Fraud: Ads are either too small or too large for users to read. Many ad networks give developers advice about the sizes of different types of ads. For example, developers that use the AdMob library can implement a number of different ad formats, depending on the platform (e.g., mobile vs. tablet) [17]. The standard banner ad size is 320*50 [20]. Although the ad sizes advised by ad networks are not mandatory and there is no fixed value for ad size, the ratio between the ad size and the screen size lies in a relatively stable range so that the ad can be viewed normally by users. Most ads in size fraud are too small, so that app publishers can pass the app off as "ad-free". "Large size" fraud attracts more attention and clicks, but it harms the user's visual experience.

6) Number Fraud: An app page displays too many ads. The DoubleClick ad program policies [18] stipulate that the "Number of ads permitted to display on a page do not exceed the amount of site content". In general, more than three ads in a single state is considered unacceptable for users. Ad content that exceeds the amount of app content seriously affects the normal functioning of the app.

7) Overlap Fraud: Ads overlap actionable views. The behavior policies of Google AdMob state that ads should not be placed in a location that covers up or hides any area that users have an interest in viewing during typical interaction [17]. App publishers use overlap fraud to trigger undesirable impressions and clicks, which may annoy users.
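The three geometric rules above (size, number, overlap) can be sketched as predicates over view bounds. The three-ads-per-state limit and the overlap rule come from the text; the size-ratio thresholds below are illustrative assumptions, since the paper only says the ad-to-screen ratio lies in a "relatively stable range".

```python
def area(r):
    """Area of a (left, top, right, bottom) rectangle."""
    return max(0, r[2] - r[0]) * max(0, r[3] - r[1])

def size_fraud(ad_bounds, screen_bounds, lo=0.01, hi=0.6):
    """Flag ads whose area ratio to the screen falls outside a plausible
    range; lo/hi are assumed thresholds, not values from the paper."""
    ratio = area(ad_bounds) / area(screen_bounds)
    return ratio < lo or ratio > hi

def number_fraud(ad_views):
    """More than three ads in a single state is treated as unacceptable."""
    return len(ad_views) > 3

def overlap_fraud(ad_bounds, actionable_bounds):
    """Flag an ad view that intersects any actionable view."""
    def intersects(a, b):
        return not (a[2] <= b[0] or b[2] <= a[0] or a[3] <= b[1] or b[3] <= a[1])
    return any(intersects(ad_bounds, v) for v in actionable_bounds)
```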

In order to detect all seven kinds of ad fraud, we have implemented a tool called FrauDroid, which combines analysis of background networking traffic and the visual layouts of ad views to automatically detect ad fraud in Android apps.

III. FRAUDROID OVERVIEW

The overall architecture of FrauDroid is shown in Figure 3. To detect ad fraud efficiently, we first use a combined approach to identify apps that use ad libraries among massive numbers of apps. On one hand, apps that use ad libraries always need several permissions to load and display ads, including "android.permission.INTERNET" and "android.permission.ACCESS_NETWORK_STATE". On the other hand, we take advantage of LibRadar [21], an obfuscation-resilient tool that detects third-party libraries used in Android apps. For the apps that use ad libraries, FraudBot automatically runs each app on a smartphone and generates the UI State Transition Graph, which records the layout information and networking traffic for each UI state. The relationships between UI states and the corresponding event transition information are also recorded to restore the whole app execution process. Finally, the Ad View Detector identifies the UI states that contain ad views, and the Fraud Checker uses a set of heuristic rules to identify ad fraud by analyzing the layout information and background networking traffic of each state.
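The permission side of this pre-filter is straightforward to sketch: an app is a candidate only if its manifest requests both permissions that ad libraries need. How the permission list is extracted (e.g., from the manifest) is left abstract here.

```python
# The two permissions named in the text that ad libraries require.
AD_PERMISSIONS = {
    "android.permission.INTERNET",
    "android.permission.ACCESS_NETWORK_STATE",
}

def may_use_ad_library(permissions):
    """Return True if the declared permissions include everything an ad
    library needs to fetch and display ads. This is a necessary (not
    sufficient) condition, so it is combined with library detection."""
    return AD_PERMISSIONS.issubset(set(permissions))
```

Apps passing this check would then be confirmed with a library detector such as LibRadar, as described above.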

IV. GENERATING UI STATE TRANSITION GRAPH

One of the key ideas behind our work is that manipulation of the visual layout of ad views in a mobile app can be detected automatically by traversing all the UI states. We therefore use automated input generation techniques to obtain the UI information for ad fraud detection.

However, it is time-consuming to go through and record all the UI states of an app, which makes this method unsuitable for large-scale app analysis. In our prior experiment, it took several hours to traverse all the states of an app. Our key idea is based on the finding that more than 90% of UI states do not contain any ad views. Thus, we propose FraudBot, an automated, scalable tool that triggers the UI states of an app, especially the states that contain ad views. FraudBot uses a more sophisticated exploration strategy that preferentially chooses suspicious ad views for fraud detection. Besides, we monitor the networking traffic to help identify stealthy background downloading behavior that cannot be detected from the UI states, which helps detect implicit UI transition fraud.

FraudBot adopts a state transition model and uses an ad-first exploration strategy to traverse as many UI states that contain ad views as possible within a limited time. Figure 4 shows an example of the state transition model. Basically, it is a directed graph in which each node represents an app state and each edge between two nodes represents the test input event that triggers the state transition. Each state node records the networking traffic, the running process information, and the view tree information. Each event edge contains the details of the test input, such as the event type, the view class, the source state, etc. The state transition graph is constructed on the fly. FraudBot gets the views from the current state and generates an input event based on the ad-first exploration strategy. Once the app state changes, the new state and the event are added to the graph. The UI state transition graph can restore the whole app execution process for ad fraud detection.
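The graph model described above can be sketched as a small data structure: nodes keyed by state id (holding the layout and a traffic snapshot), edges holding the triggering event. The field names are assumptions for illustration, not FraudBot's actual schema.

```python
class TransitionGraph:
    """Directed UI state transition graph, built on the fly during exploration."""

    def __init__(self):
        self.states = {}   # state_id -> {"views": [...], "traffic": int}
        self.edges = []    # (source_id, event_dict, target_id)

    def add_state(self, state_id, views, traffic=0):
        """Record a UI state with its view tree and networking-traffic snapshot."""
        self.states[state_id] = {"views": views, "traffic": traffic}

    def add_transition(self, source_id, event, target_id):
        """Record the input event that moved the app from source to target."""
        self.edges.append((source_id, event, target_id))

    def successors(self, state_id):
        """All (event, target) pairs reachable from a state; the Fraud
        Checker walks these to replay the app execution."""
        return [(e, t) for (s, e, t) in self.edges if s == state_id]
```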


Fig. 4. An example of the state transition graph. Note that the data in this graph is simplified for easy understanding.

A. Ad First Exploration Strategy

We developed an Ad First Exploration Strategy in FraudBot to traverse the UI states that contain ad views, balancing coverage and time efficiency. Mobile users trigger events (e.g., clicking, pressing) to interact with UI components such as buttons, scrollbars, etc. FraudBot therefore generates these events based on the current state to simulate real user behaviors.

Traversing all the UI states of an app takes more than one hour on average, which is unsuitable for scalable ad fraud detection [4]. We therefore propose an ad-first exploration strategy that traverses the states containing ad views, achieving a balance between UI coverage and time efficiency. The core idea of the strategy is to traverse only the main UI state and the exit UI state, and to click the ad views located in these two states first. Basically, FraudBot performs a breadth-first search (BFS) and reorders the views acquired from each state according to the ad features described in Section V-A and an ad-first rule. In our preliminary experiments, we found that more than 90% of ads are displayed in the main UI state and the exit UI state. Thus we set the traversal depth to 1, which means that FraudBot only explores the views in the main UI state.
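The ad-first reordering can be sketched as scoring each view's ad-likeness and exploring high-scoring views first. The keyword list and the banner-aspect-ratio heuristic are crude illustrative assumptions, not the trained features of Section V-A.

```python
# Assumed keyword hints; many ad library class names contain such tokens.
AD_KEYWORDS = ("ad", "banner", "interstitial")

def ad_score(view):
    """Crude ad-likeness score from string and placement features."""
    score = 0
    cls = view.get("class", "").lower()
    if any(k in cls for k in AD_KEYWORDS):
        score += 2
    w = view["bounds"][2] - view["bounds"][0]
    h = view["bounds"][3] - view["bounds"][1]
    if h and abs(w / h - 320 / 50) < 0.5:  # close to the 320*50 banner aspect
        score += 1
    return score

def ad_first_order(views):
    """Reorder the views of a state so suspected ad views are clicked first,
    before the BFS (depth 1) moves on."""
    return sorted(views, key=ad_score, reverse=True)
```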

FraudBot first checks whether the initial state is the main UI state. If it is a splash state (usually containing some


Fig. 5. An example of a view tree for explicit UI transition fraud.

tutorial pages) or an ad state, the strategy sends a back event or a close event to reach the main UI state of the app. Then, it gets the views of the main UI state and identifies the ad views based on the string features, type features, and placement features. FraudBot prioritizes the views so that ad views are triggered first. Finally, it exits the app after it has clicked all the views. Considering that loading an ad view takes time, the transition time between two states is set to 5 seconds, according to the average ad view loading time in our experimental network environment. By using the ad-first exploration strategy with a traversal depth of 1, we reduce the traversal time from one hour per app to three minutes per app on average, which effectively improves the efficiency of our approach.

B. Constructing View Tree

Figure 5 shows an example of the view tree, which is detected as explicit UI transition fraud. The root is the ground layout view, which contains several upper views. Each view can be treated as a node in the tree with basic properties such as position, size and class name. Note that each parent node is a container of its child nodes, and only leaf nodes can interact with users. Therefore we focus only on leaf nodes, because only leaf nodes are likely to be ad views.
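Collecting the leaf nodes of such a view tree can be sketched as follows. This is a simplified illustration; the dictionary format mirrors the JSON view records in Figure 5, but the exact field names are assumptions.

```python
def leaf_nodes(node):
    """Recursively collect the leaf views of a view tree. Parent nodes
    are mere containers, so only leaves can be ad view candidates."""
    children = node.get("children", [])
    if not children:
        return [node]
    leaves = []
    for child in children:
        leaves.extend(leaf_nodes(child))
    return leaves

# A tiny tree: a root layout containing one container and one leaf button.
tree = {"class": "LinearLayout",
        "children": [{"class": "FrameLayout",
                      "children": [{"class": "com.pop.is.ar", "children": []}]},
                     {"class": "android.widget.Button", "children": []}]}
```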

The attribute information of leaf nodes can be used to identify ad views. As shown in Figure 5, the leaf node with tag 9 has a self-developed class name "com.pop.is.ar", which is a string feature of an ad view candidate. Besides, it has a bounds value of {[135, 520], [945, 1330]}, meaning that it is located in the center of the screen, and a size value of 810*810 that fits an interstitial ad size. It can then be confirmed as an ad view by combining the string features and the placement features (see Section V-A).

However, layout information alone is not enough to detect implicit UI transition fraud: downloading behaviors cannot be identified from the attributes of views. Our approach is based on the finding that background downloading causes unusual networking traffic. So we monitor the networking traffic of the target app at runtime. Once the networking traffic increases greatly, we check the downloading behaviors and analyze whether the traffic is triggered by unsolicited app downloading.

V. AUTOMATED AD FRAUD DETECTION

FrauDroid detects ad fraud automatically by analyzing the layout information and networking traffic stored in the UI state transition graph. It consists of two key components: the Ad View Detector and the Fraud Checker.

A. Ad View Detector

As mentioned earlier in this paper, it is hard to identify ad views in Android apps. To differentiate ad views from the large number of normal views in a given UI state, we manually labeled a large amount of ad views and normal views and compared them from various aspects (e.g., string features, type features and placement features). We then use a combined heuristic approach to detect ad views.

String feature. We found that many ad views have specific string features in the "class" field. Most ad libraries implement the ad views themselves instead of using the Android system views. Thus the ad views have self-developed

class names that differ from system views such as android.widget.Button and android.widget.TextView. Besides, most developers tend to name the classes of ad views with string hints such as "AdWebview" and "AdLayout". Unfortunately, simply matching "ad" in the class name would lead to a substantial portion of false positives, as several non-ad view class names contain "ad" inside common words (e.g., shadow, gadget, load, adapter, adobe). To work around this limitation, we collect all English words containing "ad" from SCOWL (a total of 13,385 words), and dismiss class names containing such words as potential ad views. So if the class name of a view is self-developed, or if the class name contains a string hint, it could be an ad view candidate.
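The string-feature rule can be sketched as follows, a minimal illustration with a tiny stand-in word list; the real detector loads the full set of 13,385 SCOWL words, and the system-package prefixes used to decide "self-developed" are our assumption.

```python
# Hypothetical mini word list standing in for the SCOWL words containing "ad".
AD_WORDS = {"shadow", "gadget", "load", "adapter", "adobe"}
SYSTEM_PREFIXES = ("android.widget.", "android.view.", "android.webkit.")

def is_ad_view_candidate(class_name):
    """String-feature rule: flag a class name with an 'ad' hint that is
    not explained by a common English word, or a self-developed class."""
    last = class_name.lower().split(".")[-1]
    if "ad" in last and not any(word in last for word in AD_WORDS):
        return True
    # Self-developed (non-system) class names remain candidates.
    return not class_name.startswith(SYSTEM_PREFIXES)
```

For instance, "com.foo.AdWebview" is a candidate, while "android.widget.LoadingView" is dismissed because its "ad" comes from the common word "load".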

Type feature. For apps without string features, ad views are implemented by Android system views. We find that all the ad views in our ground truth apps belong to three types: "ImageView", "WebView" and "ViewFlipper". Thus the type information is helpful in pinpointing possible ad views.

Placement feature. In general, most ad views have special size and location characteristics, which we call placement features. Mobile ads can be displayed in three common ways: 1) banner ads, where the ad view is located at the top or bottom of the screen; 2) interstitial ads, where the ad view is square and located in the center of the screen; 3) full-screen ads, where the ad view fills the whole screen. Many ad networks specify the ad size for different types of mobile ads [17], which can be used to identify possible ad views.
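A placement classifier along these lines can be sketched as follows. This is a simplified illustration; the edge and squareness tolerances are our assumptions, not the paper's calibrated values.

```python
def classify_placement(bounds, screen_w, screen_h):
    """Heuristic placement check: return the likely ad type for a view,
    or None. bounds = [[left, top], [right, bottom]] in pixels."""
    (left, top), (right, bottom) = bounds
    w, h = right - left, bottom - top
    if w >= screen_w * 0.95 and h >= screen_h * 0.95:
        return "full-screen"
    # Banner: short strip glued to the top or bottom edge.
    if h <= screen_h * 0.15 and (top <= screen_h * 0.05 or bottom >= screen_h * 0.95):
        return "banner"
    # Interstitial: roughly square and vertically centered.
    center_y = (top + bottom) / 2.0
    if abs(w - h) <= 0.1 * max(w, h) and abs(center_y - screen_h / 2.0) <= screen_h * 0.1:
        return "interstitial"
    return None
```

Applied to the node of Figure 5 (bounds {[135, 520], [945, 1330]}) on a 1080x1920 screen, this yields "interstitial", matching the manual analysis.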

We use a combined heuristic approach to detect ad views. We first use string features to match possible self-developed ad views and type features to match possible Android system ad views, which yields a large set of ad view candidates. We then leverage placement features to further confirm ad views among these candidates. For known ad libraries, the ad view has relatively stable string and placement features; we have manually labeled the features of popular libraries, including Admob and Waps. For views that do not have string features, we combine type and placement features to distinguish ad views. Although the combined type and placement features of some non-ad views are similar to those of ad views, which could mislead our ad view detector, the false positives are acceptable (see Section VII-A).

B. Fraud Checker

We created a Fraud Checker based on a set of heuristic rules to detect all seven kinds of ad fraud.

Explicit UI Transition Fraud. This type of fraud is mainly found in interstitial ads. Note that an interstitial ad is an isolated activity that covers the activity of the host app, so the ad view and the other views cannot be obtained from a single state. Fraud Checker first traverses all the UI states. If a UI state contains interactive views (e.g., dialogs or buttons) that entice users to click, Fraud Checker examines the following UI state to analyze whether an ad view exists. Specifically, Fraud Checker identifies the state transition information by checking the transition graph. Fraud Checker flags the UI state as explicit UI transition fraud if an interstitial ad is placed on top of interactive views.

Implicit UI Transition Fraud. Implicit UI transition fraud triggers unsolicited downloading behaviors (apps or files) without the user's interaction once an ad is clicked. Any UI state that meets the following rules is identified as implicit UI transition fraud: 1) there are ad views in the state; 2) there is a downloading behavior; 3) the following state does not have any interactive interface and still belongs to the same activity; and 4) the state is triggered by a touch event. Thus, we monitor the networking traffic at runtime; once the networking traffic increases greatly, we check whether the behavior is triggered by unsolicited app downloading. In our experiment, we find that the networking traffic speed of downloading behavior is generally higher than 150 KB/s, which is much higher than that of loading ads or other network behaviors. So if the networking traffic speed increases greatly and exceeds 150 KB/s, we conclude that the touch event triggered a downloading behavior.
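The traffic-rate rule can be sketched as follows, assuming we can sample the app's cumulative received bytes before and after the touch event (e.g., from per-app traffic counters); the function name and sampling interface are hypothetical.

```python
DOWNLOAD_SPEED_KBPS = 150  # empirical threshold from our experiments

def is_download_triggered(bytes_before, bytes_after, interval_s):
    """Flag a touch event as triggering a download when the traffic rate
    over the observation window exceeds 150 KB/s."""
    rate_kbps = (bytes_after - bytes_before) / 1024.0 / interval_s
    return rate_kbps > DOWNLOAD_SPEED_KBPS
```

For example, receiving 1,000,000 bytes within the 5-second state transition window (about 195 KB/s) is flagged as a download, while 100,000 bytes (about 20 KB/s) is consistent with ordinary ad loading.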

Pop-up Window Fraud. Ads should not be placed in apps that are running in the background of the device or outside of the app environment. Fraud Checker first obtains the list of activity names that belong to the app. It then scans all the UI states; if the activity name of a state is not in the list and there are ad views in the state, Fraud Checker identifies the state as pop-up window fraud.

Hidden Fraud. Fraud Checker flags a UI state as hidden fraud if an ad view in the state is (partially) covered by non-ad views. In detail, Fraud Checker uses the size and location information to analyze whether ad views overlap with other non-ad views. The hierarchical relationship between the views (more specifically, the z-coordinates of the views) is used to determine whether an ad view is hidden.
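The geometric core of this check, rectangle intersection plus a z-order comparison, can be sketched as follows (a minimal illustration; the field names are assumptions):

```python
def rect_intersect(a, b):
    """Intersection area of two view rectangles [[left, top], [right, bottom]]."""
    (al, at), (ar, ab) = a
    (bl, bt), (br, bb) = b
    w = min(ar, br) - max(al, bl)
    h = min(ab, bb) - max(at, bt)
    return max(w, 0) * max(h, 0)

def is_hidden(ad_view, other_views):
    """An ad view is (partially) hidden if a non-ad view with a higher
    z-coordinate intersects its bounds."""
    return any(v["z"] > ad_view["z"] and
               rect_intersect(ad_view["bounds"], v["bounds"]) > 0
               for v in other_views)
```

The same intersection test serves overlap fraud detection with the z-order comparison reversed (the ad view drawn on top of other views).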

Size Fraud. Many ad networks specify recommended ad sizes for different types of ad views [17] [20]. Although the ad size varies with the screen size, the size ratio between the ad view and the screen is relatively stable. We define the size ratio as the ad view area divided by the screen area, and use it as the criterion, with thresholds for different types of ad views derived from extensive experiments. We set the size ratio threshold to 0.005 for banner ads, 0.2 for interstitial ads and 1 for full-screen ads. The banner ad view is usually located at the top or bottom of the screen, while interstitial or full-screen ad views are usually located in the center of the screen. Combining location and size information for a given ad view, Fraud Checker checks whether its size ratio violates the regulation.
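The ratio check can be sketched as follows, reading the thresholds above as minimum acceptable ratios (consistent with the size fraud case study, where the ad is too small to read); this interpretation and the function name are our assumptions.

```python
# Size-ratio thresholds (ad view area / screen area) from our experiments.
MIN_RATIO = {"banner": 0.005, "interstitial": 0.2, "full-screen": 1.0}

def violates_size_rule(ad_type, view_w, view_h, screen_w, screen_h):
    """Size fraud check: flag the ad when its area fraction of the screen
    falls below the threshold for its placement type."""
    ratio = (view_w * view_h) / float(screen_w * screen_h)
    return ratio < MIN_RATIO[ad_type]
```

On a 1080x1920 screen, the 810*810 interstitial of Figure 5 has ratio 0.32 and passes, while a 100*10 banner (ratio 0.0005) is flagged.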

Number Fraud. For each UI state, Fraud Checker checks the number of ad views and the size of each ad view. Although there is no strict regulation on the number of ad views, some ad network behavior policies require that "the number of ads permitted to not exceed the amount of site content" [18]. More specifically, Fraud Checker analyzes whether the total display area of the ad views occupies more than 50% of the screen area.
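The 50% area rule amounts to a simple aggregation over the ad views of a state, sketched below (the function name and input format are hypothetical):

```python
def violates_number_rule(ad_view_sizes, screen_w, screen_h):
    """Number fraud check: flag the state when the total ad display area
    exceeds 50% of the screen area. ad_view_sizes is a list of (w, h)."""
    total = sum(w * h for (w, h) in ad_view_sizes)
    return total > 0.5 * screen_w * screen_h
```

For example, three 1080*400 ads on a 1080x1920 screen cover about 63% of the screen and are flagged, while a single 1080*150 banner is not.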

Overlap Fraud. In contrast to hidden fraud, a state is

TABLE I
THE EXPERIMENT RESULT OF AD VIEW DETECTOR.

                 Ad-contained (prediction)   Ad-free (prediction)
Ad-contained     TP (176)                    FN (24)
Ad-free          FP (49)                     TN (4008)

identified as overlap fraud if some ad views are placed on top of other views. Thus overlap fraud detection is similar to hidden fraud detection: size and location information are used to identify whether the views overlap.

VI. IMPLEMENTATION

FrauDroid is implemented in Python, with more than 3,000 lines of code. Our experiments run on a Nexus 5 smartphone with the LineageOS [22] system instead of emulators, because some ad libraries refuse to display ads when they detect that the app is running in an emulator environment.

FraudBot is based on DroidBot [23], a lightweight UI-guided test input generator that is able to interact with an Android app on almost any device without instrumentation. It generates UI-guided test inputs based on a state transition model built on-the-fly, and allows users to integrate their own strategies or algorithms. We integrate the ad-first exploration strategy for efficient ad fraud detection.

We take advantage of Hierarchy Viewer [24] to understand the layout and obtain exhaustive information about each UI view, such as name, size and class name. Based on this comprehensive information, we can easily build a view tree that accurately describes the current state.

VII. EVALUATION

We first apply FrauDroid to a labeled dataset to measure its accuracy and performance. Note that we evaluate the Ad View Detector and the Fraud Checker separately. Then we apply FrauDroid to a large set of apps to study the prevalence of ad fraud in major Android app markets.

A. Ad View Detector

We first measure the accuracy of the Ad View Detector. We randomly chose 500 popular apps from Google Play. We then used FraudBot to obtain the UI view trees of each app (200 seconds per app on average), yielding 4,257 states for the 500 apps. The Ad View Detector analyzes each UI state to identify ad views. We manually examined all 4,257 states to label the ground truth.

As shown in Table I, the Ad View Detector identified 225 states that contain ad views, among which 49 states are false positives, achieving a detection precision of 78.2%. Besides, 24 states that contain ad views were not detected, so the recall reaches 88%. The result suggests that our detector, combining string features and placement features, can identify ad views with a high recall and an acceptable precision.
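The precision and recall follow directly from the confusion matrix in Table I:

```python
tp, fn, fp, tn = 176, 24, 49, 4008  # Table I

precision = tp / (tp + fp)  # 176 / 225 = 0.782...
recall = tp / (tp + fn)     # 176 / 200 = 0.88
```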

What caused the false positives and false negatives? The main reason for false positives is that some states have no string features, so we can only use type features and

TABLE II
THE DISTRIBUTION OF 47 LABELED AD FRAUD APPS.

Explicit  Implicit  Pop-up  Hidden  Size  Number  Overlap
5         33        2       2       3     2       2

TABLE III
THE EXPERIMENT RESULT OF FRAUD CHECKER.

                 Ad fraud (prediction)   W/O ad fraud (prediction)
Ad fraud         TP (45)                 FN (2)
W/O ad fraud     FP (7)                  TN (34)

placement features to distinguish ad views from non-ad views. However, the type features and placement features of some non-ad views are similar to those of ad views, which could mislead our ad view detector.

The cause of false negatives is that FraudBot fails to obtain some attributes of the ad view in a few cases. The attributes are then set to a default value (e.g., 0*0 for size) that may be regarded as an exception by the Ad View Detector. The 24 false negative states have some exceptional information (e.g., size or location) recorded by FraudBot, and the Ad View Detector fails to identify them.

B. Fraud Checker

To evaluate the performance of the Fraud Checker, we manually labeled a dataset of 47 apps which contains at least one instance of each type of ad fraud. The distribution of this labeled dataset is shown in Table II. Note that there may be multiple types of fraud in an app; for example, two apps (anzhi.fighttrain and az360.gba.jqrgsseed) were found to contain both explicit UI transition fraud and implicit UI transition fraud. We also labeled 41 apps that contain normal ads for comparison.

As shown in Table III, the Fraud Checker achieves high accuracy, with a precision of 86.5% and a recall of 95.7%.

Among the seven false positive apps, three are misclassified as pop-up window fraud, three as overlap fraud, and one as implicit UI transition fraud. The main reason for the false positives is that some ad views cannot be detected correctly: 6 of the 7 false positive apps are caused by inaccurate ad view detection. The remaining app is misclassified as implicit UI transition fraud because it loads a large image in the ad view, which leads to a networking traffic speed exceeding 150 KB/s.

Two apps bypassed the detection of the Fraud Checker, including one pop-up window fraud app and one overlap fraud app. By analyzing the results, we find that the recorded size of the ad view is 350*0, whereas it should be 350*50 as a banner ad size. FraudBot cannot obtain the height of the view, causing the Ad View Detector to fail to identify the ad view through the placement features.

C. Performance Evaluation

We then analyzed the performance of FrauDroid from two aspects: (1) generating the UI transition graph and (2) checking

Fig. 6. Performance evaluation of FrauDroid.

ad fraud. For each app size scale (100KB, 1MB, 10MB, 50MB and 100MB), we chose 500 popular apps (2,500 apps in total) and then analyzed the time cost and memory cost at runtime. Note that we do not consider the cost of ad library detection, because our tool is based on LibRadar and the performance of LibRadar has been analyzed in detail in previous work [21] [25].

As shown in Figure 6, the time cost and memory cost of FraudBot grow linearly as the app size increases, while the Fraud Checker is independent of app size. The reason is that large apps usually have more activities and states that need to be loaded and traversed by FraudBot, while the performance of the Fraud Checker mainly relies on the UI transition graph, which is not directly related to app size. The average time cost of FraudBot is much higher than that of the Fraud Checker (216.774 s vs 0.402 s), and both have a low memory cost of about 20 MB. Thus the performance of FrauDroid is mainly determined by FraudBot.

VIII. A STUDY OF AD FRAUD IN MAJOR APP MARKETS

In this section, we study the prevalence of ad fraud in a large set of apps from 21 major Android stores. We collected 80,539 Android apps, including 28,126 apps from Google Play and 52,413 apps from third-party app markets, as shown in Table IV. These apps were downloaded between February 18, 2017 and March 19, 2017. Taking advantage of LibRadar [21], we identified 9,940 apps from Google Play and 21,353 apps from third-party app markets that use ad libraries.

A. Overall Result

As shown in Table IV, FrauDroid identified 92 ad fraud apps (accounting for 0.29% of the apps that use ad libraries in our dataset), including 5 from Google Play and 87

TABLE IV
DATASET FROM 21 MAJOR ANDROID APP MARKETS.

App Market              #Apps   #Ad Apps  #Fraud  Percentage
Google Play             28126   9940      5       0.05%
360 Mobile Assistant    10471   3497      4       0.11%
Baidu App Store         7891    2847      4       0.14%
Baidu Mobile Assistant  4167    1868      8       0.43%
NetEase App Store       3105    1156      10      0.87%
Sina App Center         5643    2357      4       0.17%
Gfan                    1287    596       5       0.84%
Liqu                    1578    751       2       0.27%
App Treasure            1333    643       2       0.31%
MUMAYI                  1572    711       4       0.56%
AppKu                   1270    588       4       0.68%
Julur                   1362    581       2       0.34%
Huli                    2690    871       0       0
Haozhuo                 1072    597       4       0.67%
tvapk                   1474    634       0       0
anfone                  1121    491       0       0
7BOX                    1206    706       2       0.28%
MM Store                1350    478       6       1.26%
Anfen                   1213    562       1       0.18%
UA App Store            1057    551       2       0.36%
PP Assistant            1560    868       23      2.65%
Total                   80539   31293     92      0.29%

TABLE V
AD FRAUD DISTRIBUTION BY TYPE.

Explicit  Implicit  Pop-up  Hidden  Size  Number  Overlap
13        71        8       2       7     5       1

from third-party markets. On average, 0.05% of the apps that use ad libraries in Google Play are ad fraud apps, while 2.65% of the apps that use ad libraries in the PP Assistant market are ad fraud apps.

These apps exist in various categories, especially games and tools. Most of them are still available in the app markets with many downloads. For example, an audio player app named love ringtones2 has 160,000 downloads in Baidu Mobile Assistant and is marked as safe, which suggests that the market pays little attention to ad fraud.

B. Ad Fraud Distribution by Type

For the 92 ad fraud apps, the distribution of ad fraud types is shown in Table V. More than 90% of the ad frauds belong to dynamic interactive fraud, which has never been explored by previous studies. Implicit UI transition fraud and explicit UI transition fraud take up a large part, while static placement fraud occupies only a small portion of the fraud apps. One possible reason is that dynamic interactive frauds are more difficult to detect and more easily mislead users into triggering ad impressions and clicks, so developers and publishers tend to use these more sophisticated tricks to gain more benefits.

C. Ad fraud Distribution by Ad Networks

We also explored the prevalence of ad fraud across different ad networks. The 92 ad fraud apps are distributed across

2http://shouji.baidu.com/software/11121977.html

TABLE VI
DISTRIBUTION OF AD FRAUD APPS IN DIFFERENT AD NETWORKS.

Ad network  #Ad fraud    Ad network   #Ad fraud
Waps        24           Adwhirl      2
Admob       14           Yimob        1
InMobi      6            Vpon         1
BaiduAd     6            Ninebox      1
Anzhi       6            Mediav       1
XiaomiAd    5            lanjingke    1
Youmi       4            Juzi         1
Mobile7     4            Jumptap      1
Admogo      4            Flurry       1
Kugo        3            Domob        1
FineFocus   2            Anzhuan360   1
Dianjin     2            -            -

23 different ad networks, including famous ad networks (e.g., Admob and Waps) and lesser-known ad networks such as Yimob and Kugo, as shown in Table VI.

The result suggests that ad fraud is prevalent across ad networks. Admob and Waps are two of the most popular ad libraries, so it stands to reason that they attract the largest number of ad fraud apps. Although Google Admob has released strict restrictions [17] on how ad views should be placed to avoid ad fraud, we still found 14 ad fraud apps that use the Admob ad library. The most popular one is Spider Man Total Mayhem HD3, with 6.2 million downloads in 360 Mobile Assistant, and we find many implicit UI transition fraud complaints in its comments section.

We learn from the results that ad fraud is distributed across most markets and ad networks; it is threatening the mobile advertising ecosystem and deserves more attention. Most ad fraud apps can be classified as grayware, because they contain annoying, undesirable or undisclosed behaviors that cannot be classified as malware [26]. These gray behaviors do not cause direct loss to stakeholders, so they do not attract as much attention as malware does.

We also find that only a few famous ad networks have released policies on how their ad libraries should be used; most ad networks do not publish any policies. Thus it is necessary for all ad networks to regulate the gray behaviors of app developers. Besides, app markets should make efforts to detect and remove ad fraud apps.

D. Case Study

Figure 7 shows case studies of the seven types of ad fraud apps we detected from Google Play and third-party app markets. Fig. 7 (1) shows an example of explicit UI transition fraud (app: com.android.yatree.taijiaomusic). The ad view pops up above the exit dialog, causing undesirable clicks on the ad instead of the exit button; this app misled 9 of 10 users into clicking the ad view in our study. Fig. 7 (2) shows an example of pop-up window fraud (app: com.natsume.stone.android). The ad view is displayed on the home screen even after the mobile user exits the app. Fig. 7 (3) shows an example of implicit

3http://zhushou.360.cn/detail/index/soft id/16640

UI transition fraud (app: com.hongap.slider). When users click the ad view shown in the left screenshot, the app starts downloading directly without the user's interaction. Fig. 7 (4) shows an example of hidden fraud (app: forest.best.livewallpaper). The ad view is hidden behind the Email button. Fig. 7 (5) shows an example of size fraud (app: com.dodur.android.golf). The ad view on the right side of the screenshot is too small for users to read. Fig. 7 (6) shows an example of number fraud (app: com.maomao.androidcrack). A total of 3 ad views are placed on top of the page, which exceeds the amount of site content. Fig. 7 (7) shows an example of overlap fraud (app: com.sysapk.wifibooster). The ad view is placed on top of four buttons of the host app.

IX. DISCUSSION

In this section, we discuss possible limitations of our approach and potential future improvements.

Ad view detection. We use a heuristic approach that combines string features and placement features to identify ad views. However, some apps are fully obfuscated, making it hard to extract meaningful string features; only placement features can be used to identify ad views in this case. Thus one direction of future work is to develop an obfuscation-resilient and more accurate approach to identify ad views.

Ad state coverage. More than 90% of ads are displayed in the main UI state and the exit UI state in our experiment. Thus we optimize our UI exploration strategy to achieve a balance between performance and UI coverage. It is quite possible that FraudBot with a traversal depth of one cannot traverse all the possible UI states that contain ad views, so we could expand the ad state coverage by setting a larger traversal depth.

Other types of ad fraud. Our work is the first to identify "dynamic interactive fraud", and we have built a tool to automatically detect all seven types of ad fraud in Android apps. However, there may be other types of ad fraud in mobile apps, so we could extend FrauDroid to detect new types of fraud in the future. After a manual audit and code review of the 92 ad fraud apps, we find that developers have the main incentive for conducting ad fraud: all of the ad fraud behaviors are caused by the misuse or improper use of the ad libraries provided by ad networks. Although we did not find ad libraries directly involved in ad fraud, we cannot exclude the possibility that a small number of ad libraries conduct ad fraud.

X. RELATED WORK

FrauDroid is inspired by two lines of related work.

A. Ad Fraud Detection

Ad fraud in web advertising has been extensively studied. Most studies focus on click fraud, including detecting click fraud based on networking traffic [8] [9] or search engine query logs [10], characterizing click fraud [11] [12] [13] and analyzing its profit model [14].

(Seven screenshots, one per fraud type: Explicit, Implicit, Overlap, Pop-up window, Hidden, Number, Size.)

Fig. 7. Case study of ad fraud apps.

Existing studies on mobile ad fraud mainly focus on click fraud and static placement fraud. Crussell et al. [6] developed a novel approach for automatically identifying click fraud in three steps: building HTTP request trees, identifying ad request pages using machine learning, and detecting clicks in HTTP request trees using heuristic rules. Liu et al. [4] investigated static placement fraud on Windows Phone by analyzing the layouts of apps. Their approach used automated app navigation, together with optimizations to scan a large number of visual views, and then a set of rules to detect whether ad fraud exists.

However, with the evolution of ad fraud, new types of ad fraud have become prevalent that cannot be handled by state-of-the-art approaches. We introduce a new category called dynamic interactive fraud and design an automated ad fraud detection tool to detect both dynamic interactive fraud and static placement fraud in Android apps.

B. App Automation

Several Android automation frameworks, such as Hierarchy Viewer [24], UIAutomator [27] and Robotium [28], are provided by the Android platform for app testing. But these tools rely on developers to provide automation scripts and are not suitable for scenarios such as compatibility testing and malware analysis [4].

Thus a great deal of research has been performed on full app automation, especially on automated test input generation for Android [29]. As Android apps are event-driven, these testing tools can generate event inputs following

different strategies, including random, model-based and systematic exploration strategies. Monkey [30], Dynodroid [31], Intent Fuzzer [32] and DroidFuzzer [33] employ a random strategy. They are mainly used for stress testing. However, a random strategy can hardly generate highly specific inputs. Moreover, these tools are not aware of how much behavior of the app under test has already been covered, and thus are likely to generate redundant events that do not help the exploration, so random strategy tools are not suitable for ad fraud detection. Model-based strategy tools [34] (e.g., GUIRipper [35], MobiGUITAR [36], A3E [37], SwiftHand [38], PUMA [39], etc.) build a GUI model of the app and use it to generate events and systematically explore the behavior of the app. Most of them use finite state machines as the model, with activities as states and events as transitions, and implement a DFS or BFS strategy to explore the states for better code coverage. Systematic strategy tools [40] [41] use more sophisticated techniques such as symbolic execution and evolutionary algorithms to guide the exploration towards code that can only be revealed by specific inputs. However, most of them need to instrument either the system or the app, and thus cannot be directly used to detect ad fraud. Our work is built on DroidBot [23], a lightweight UI-guided test input generator that is able to interact with Android apps without instrumentation. It allows users to integrate their own strategies for different scenarios, which makes it possible to detect ad fraud. We improved DroidBot and implemented a

more sophisticated ad view exploration strategy for automated, scalable ad fraud detection.

XI. CONCLUSION

In this paper, we presented an explorative study of ad fraud in Android apps. We first created a taxonomy of existing mobile ad frauds; besides static placement fraud, we created a new category called dynamic interactive fraud that has never been explored by previous work. We then proposed FrauDroid, a new approach to detect ad fraud based on the UI transition graph and networking traffic. Based on extensive experiments on smartphones, we identified 92 ad fraud apps, roughly accounting for 0.3% of the apps that use ad libraries in our study. The result suggests that ad fraud is threatening the mobile advertising ecosystem and deserves more attention.

REFERENCES

[1] N. Viennot, E. Garcia, and J. Nieh, “A measurement study of Google Play,” in The 2014 ACM International Conference on Measurement and Modeling of Computer Systems (SIGMETRICS’14), 2014, pp. 221–233.

[2] M. C. Grace, W. Zhou, X. Jiang, and A. R. Sadeghi, “Unsafe exposure analysis of mobile in-app advertisements,” in ACM Conference on Security and Privacy in Wireless and Mobile Networks, 2012, pp. 101–112.

[3] K. Springborn and P. Barford, “Impression fraud in online advertising via pay-per-view networks,” in USENIX Conference on Security, 2013, pp. 211–226.

[4] B. Liu, S. Nath, R. Govindan, and J. Liu, “DECAF: Detecting and characterizing ad fraud in mobile apps,” in NSDI, 2014, pp. 57–70.

[5] S. Nath, F. X. Lin, L. Ravindranath, and J. Padhye, “SmartAds: Bringing contextual ads to mobile apps,” in Proceedings of the International Conference on Mobile Systems, Applications, and Services, 2013, pp. 111–124.

[6] J. Crussell, R. Stevens, and H. Chen, “MAdFraud: Investigating ad fraud in Android applications,” in Proceedings of the 12th Annual International Conference on Mobile Systems, Applications, and Services. ACM, 2014, pp. 123–134.

[7] “Mobile ad fraud: Definition, types, detection,” 2017, https://clickky.biz/blog/2016/12/mobile-ad-fraud-definition-types-detection/, Accessed July 20, 2017.

[8] A. Metwally, D. Agrawal, and A. El Abbadi, “Detectives: Detecting coalition hit inflation attacks in advertising networks streams,” in Proceedings of the 16th International Conference on World Wide Web, ser. WWW ’07, 2007, pp. 241–250.

[9] A. Metwally, F. Emekci, D. Agrawal, and A. El Abbadi, “SLEUTH: Single-publisher attack detection using correlation hunting,” Proc. VLDB Endow., vol. 1, no. 2, pp. 1217–1228, 2008.

[10] F. Yu, Y. Xie, and Q. Ke, “SBotMiner: Large scale search bot detection,” in Proceedings of the Third ACM International Conference on Web Search and Data Mining, ser. WSDM ’10, 2010, pp. 421–430.

[11] S. A. Alrwais, A. Gerber, C. W. Dunn, O. Spatscheck, M. Gupta, and E. Osterweil, “Dissecting ghost clicks: Ad fraud via misdirected human clicks,” in Proceedings of the 28th Annual Computer Security Applications Conference, ser. ACSAC ’12. New York, NY, USA: ACM, 2012, pp. 21–30.

[12] T. Blizard and N. Livic, “Click-fraud monetizing malware: A survey and case study,” in Proceedings of the 2012 7th International Conference on Malicious and Unwanted Software (MALWARE), ser. MALWARE ’12. Washington, DC, USA: IEEE Computer Society, 2012, pp. 67–72.

[13] B. Miller, P. Pearce, C. Grier, C. Kreibich, and V. Paxson, “What’s clicking what? Techniques and innovations of today’s clickbots,” in Proceedings of the 8th International Conference on Detection of Intrusions and Malware, and Vulnerability Assessment, ser. DIMVA ’11, 2011, pp. 164–183.

[14] D. McCoy, A. Pitsillidis, G. Jordan, N. Weaver, C. Kreibich, B. Krebs, G. M. Voelker, S. Savage, and K. Levchenko, “PharmaLeaks: Understanding the business of online pharmaceutical affiliate programs,” in Proceedings of the 21st USENIX Conference on Security Symposium, ser. Security ’12, 2012.

[15] G. Cho, J. Cho, Y. Song, and H. Kim, “An empirical study of click fraud in mobile advertising networks,” in Availability, Reliability and Security (ARES), 2015 10th International Conference on. IEEE, 2015, pp. 382–388.

[16] K. Lee, J. Flinn, T. J. Giuli, B. Noble, and C. Peplin, “AMC: Verifying user interface properties for vehicular applications,” in Proceedings of the 11th Annual International Conference on Mobile Systems, Applications, and Services. ACM, 2013, pp. 1–12.

[17] “Google admob & adsense policies,” 2017, https://support.google.com/admob#topic=2745287, Accessed July 20, 2017.

[18] “Doubleclick ad exchange program policies,” 2017, https://support.google.com/adxseller/topic/7316904?hl=en&ref topic=6321576,Accessed July 20, 2017.

[19] “China communications standards association: Mobile intelligent ter-minal malicious push information to determine the technical require-ments,” 2017, http://www.ccsa.org.cn/ccsa tx/ly to std.php?tbname=project&ly num=FB03, Accessed July 20, 2017.

[20] “Firebase banner ads,” 2017, https://firebase.google.com/docs/admob/android/banner, Accessed July 20, 2017.

[21] Z. Ma, H. Wang, Y. Guo, and X. Chen, “Libradar: Fast and accuratedetection of third-party libraries in android apps,” in Proceedings ofthe 38th International Conference on Software Engineering Companion.ACM, 2016, pp. 653–656.

[22] T. L. Project, “Lineageos android distribution,” 2017, https://www.lineageos.org, Accessed July 20, 2017.

[23] Y. Li, Z. Yang, Y. Guo, and X. Chen, “Droidbot: a lightweight ui-guidedtest input generator for android,” in Proceedings of the 39th InternationalConference on Software Engineering Companion. IEEE Press, 2017,pp. 23–26.

[24] "Profile your layout with Hierarchy Viewer," 2017, https://developer.android.com/studio/profile/hierarchy-viewer.html, Accessed July 20, 2017.

[25] H. Wang and Y. Guo, "Understanding third-party libraries in mobile app analysis," in 2017 IEEE/ACM 39th International Conference on Software Engineering Companion (ICSE-C), May 2017, pp. 515–516.

[26] B. Andow, A. Nadkarni, B. Bassett, W. Enck, and T. Xie, "A study of grayware on Google Play," in Security and Privacy Workshops (SPW), 2016 IEEE. IEEE, 2016, pp. 224–233.

[27] "Android Developers: UI Automator," 2017, https://developer.android.com/training/testing/ui-testing/uiautomator-testing.html, Accessed July 20, 2017.

[28] "Robotium developers group," 2017, https://github.com/RobotiumTech/robotium, Accessed July 20, 2017.

[29] S. R. Choudhary, A. Gorla, and A. Orso, "Automated test input generation for Android: Are we there yet? (E)," in IEEE/ACM International Conference on Automated Software Engineering, 2015, pp. 429–440.

[30] Android Developers, "The Monkey UI Android testing tool," 2017, http://developer.android.com/tools/help/monkey.html, Accessed July 21, 2017.

[31] A. Machiry, R. Tahiliani, and M. Naik, "Dynodroid: An input generation system for Android apps," in Proceedings of the 2013 9th Joint Meeting on Foundations of Software Engineering. ACM, 2013, pp. 224–234.

[32] R. Sasnauskas and J. Regehr, "Intent fuzzer: Crafting intents of death," in Joint International Workshop on Dynamic Analysis, 2014, pp. 1–5.

[33] H. Ye, S. Cheng, L. Zhang, and F. Jiang, "DroidFuzzer: Fuzzing the Android apps with intent-filter tag," in International Conference on Advances in Mobile Computing & Multimedia, 2013, p. 68.

[34] T. Takala, M. Katara, and J. Harty, "Experiences of system-level model-based GUI testing of an Android application," in Software Testing, Verification and Validation (ICST), 2011 IEEE Fourth International Conference on. IEEE, 2011, pp. 377–386.

[35] D. Amalfitano, "Using GUI ripping for automated testing of Android applications," in Proceedings of the IEEE/ACM International Conference on Automated Software Engineering, 2012, pp. 258–261.

[36] D. Amalfitano, A. Fasolino, P. Tramontana, and B. Ta, "MobiGUITAR: A tool for automated model-based testing of mobile apps," IEEE Software, vol. 32, no. 5, pp. 1–1, 2014.

[37] T. Azim and I. Neamtiu, "Targeted and depth-first exploration for systematic testing of Android apps," ACM SIGPLAN Notices, vol. 48, no. 10, pp. 641–660, 2013.

[38] W. Choi, G. Necula, and K. Sen, "Guided GUI testing of Android apps with minimal restart and approximate learning," in ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages & Applications, 2013, pp. 623–640.

[39] S. Hao, B. Liu, S. Nath, W. G. J. Halfond, and R. Govindan, "PUMA: Programmable UI-automation for large-scale dynamic analysis of mobile apps," in International Conference on Mobile Systems, Applications, and Services, 2014, pp. 204–217.

[40] R. Mahmood, N. Mirzaei, and S. Malek, "EvoDroid: Segmented evolutionary testing of Android apps," in The ACM SIGSOFT International Symposium, 2014, pp. 599–609.

[41] S. Anand, M. Naik, M. J. Harrold, and H. Yang, "Automated concolic testing of smartphone apps," in ACM SIGSOFT International Symposium on the Foundations of Software Engineering, 2012, pp. 1–11.