abusing phone numbers and cross-application features · pdf filelike truecaller and facebook...

14
Abusing Phone Numbers and Cross-Application Features for Crafting Targeted Attacks Srishti Gupta Indraprastha Institute of Information Technology Delhi [email protected] Payas Gupta New York University Abu Dhabi [email protected] Mustaque Ahamad Georgia Institute of Technology New York University Abu Dhabi [email protected] Ponnurangam Kumaraguru Indraprastha Institute of Information Technology Delhi [email protected] ABSTRACT With the convergence of Internet and telephony, new ap- plications (e.g., WhatsApp) have emerged as an important means of communication for billions of users. These ap- plications are becoming an attractive medium for attackers to deliver spam and carry out more targeted attacks. Since such applications rely on phone numbers, we explore the fea- sibility, automation, and scalability of phishing attacks that can be carried out by abusing a phone number. We demon- strate a novel system that takes a potential victim’s phone number as an input, leverages information from applications like Truecaller and Facebook about the victim and his / her social network, checks the presence of phone number’s owner (victim) on the attack channels (over-the-top or OTT mes- saging applications, voice, e-mail, or SMS), and finally tar- gets the victim on the chosen channel. As a proof of concept, we enumerate through a random pool of 1.16 million phone numbers. By using information provided by popular applica- tions, we show that social and spear phishing attacks can be launched against 51,409 and 180,000 users respectively. Fur- thermore, voice phishing or vishing attacks can be launched against 722,696 users. We also found 91,487 highly attrac- tive targets who can be attacked by crafting whaling attacks. We show the effectiveness of one of these attacks, phishing, by conducting an online roleplay user study. We found that social (69.2%) and spear (54.3%) phishing attacks are more successful than non-targeted phishing attacks (35.5%) on OTT messaging applications. Although similar results were found for other mediums like e-mail, we demonstrate that due to the significantly increased user engagement via new communication applications and the ease with which phone Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Copyright 20XX ACM X-XXXXX-XX-X/XX/XX ...$15.00. numbers allow collection of information necessary for these attacks, there is a clear need for better protection of OTT messaging applications. We propose some recommendations in this direction. Keywords Phone number, Over-The-Top, Phishing, Roleplay, Face- book, Truecaller, WhatsApp, Vishing. 1. INTRODUCTION We are being constantly targeted by cyber criminals who rely on a variety of online attacks to victimize users and enterprises. Phishing, which is a form of social engineering attacks, is often used by such criminals to fraudulently gain access to sensitive information or systems by impersonating a trusted party [35]. In the past, phishing attacks have used the e-mail and web channels to reach their victims. How- ever, recently, there has been a tremendous growth of similar phishing attempts over the telephony channel. New forms of phishing attacks have emerged exploiting traditional text messaging services, i.e., SMS (smishing [11]) and voice phish- ing (vishing [14]). Several factors make the telephony channel attractive for cyber criminals. The convergence of telephony with the In- ternet has resulted in an unprecedented growth of new forms of online communication, especially mobile communication due to the advent of Over-The-Top (OTT) messaging appli- cations (like WhatsApp, Viber, and WeChat [6]). Because of the growing popularity of OTT messaging applications, particularly WhatsApp, malicious actors are now abusing it for illicit activities like delivering spam and phishing mes- sages. Unsolicited messages like investment advertisements, adult conversation ads (random contacts requests) were seen to propagate on the channel in early 2015 [7]. Vishing attacks are also increasing due to 1) low mobile- mobile calling plans with the advent of services like Skype and Google Voice. This allows spammers to pump out huge volumes of voice calls at a marginal cost, and 2) easy caller ID spoofing due to Voice over IP phone technology that al- lows spammers to pick an area code and even the prefix arXiv:1512.07330v1 [cs.SI] 23 Dec 2015

Upload: dinhcong

Post on 22-Feb-2018

214 views

Category:

Documents


1 download

TRANSCRIPT

Abusing Phone Numbers and Cross-Application Featuresfor Crafting Targeted Attacks

Srishti GuptaIndraprastha Institute ofInformation Technology

[email protected]

Payas GuptaNew York University

Abu [email protected]

Mustaque AhamadGeorgia Institute of

TechnologyNew York University Abu

[email protected]

PonnurangamKumaraguru

Indraprastha Institute ofInformation Technology

[email protected]

ABSTRACTWith the convergence of Internet and telephony, new ap-plications (e.g., WhatsApp) have emerged as an importantmeans of communication for billions of users. These ap-plications are becoming an attractive medium for attackersto deliver spam and carry out more targeted attacks. Sincesuch applications rely on phone numbers, we explore the fea-sibility, automation, and scalability of phishing attacks thatcan be carried out by abusing a phone number. We demon-strate a novel system that takes a potential victim’s phonenumber as an input, leverages information from applicationslike Truecaller and Facebook about the victim and his / hersocial network, checks the presence of phone number’s owner(victim) on the attack channels (over-the-top or OTT mes-saging applications, voice, e-mail, or SMS), and finally tar-gets the victim on the chosen channel. As a proof of concept,we enumerate through a random pool of 1.16 million phonenumbers. By using information provided by popular applica-tions, we show that social and spear phishing attacks can belaunched against 51,409 and 180,000 users respectively. Fur-thermore, voice phishing or vishing attacks can be launchedagainst 722,696 users. We also found 91,487 highly attrac-tive targets who can be attacked by crafting whaling attacks.We show the effectiveness of one of these attacks, phishing,by conducting an online roleplay user study. We found thatsocial (69.2%) and spear (54.3%) phishing attacks are moresuccessful than non-targeted phishing attacks (35.5%) onOTT messaging applications. Although similar results werefound for other mediums like e-mail, we demonstrate thatdue to the significantly increased user engagement via newcommunication applications and the ease with which phone

Permission to make digital or hard copies of all or part of this work forpersonal or classroom use is granted without fee provided that copies arenot made or distributed for profit or commercial advantage and that copiesbear this notice and the full citation on the first page. To copy otherwise, torepublish, to post on servers or to redistribute to lists, requires prior specificpermission and/or a fee.Copyright 20XX ACM X-XXXXX-XX-X/XX/XX ...$15.00.

numbers allow collection of information necessary for theseattacks, there is a clear need for better protection of OTTmessaging applications. We propose some recommendationsin this direction.

KeywordsPhone number, Over-The-Top, Phishing, Roleplay, Face-book, Truecaller, WhatsApp, Vishing.

1. INTRODUCTIONWe are being constantly targeted by cyber criminals who

rely on a variety of online attacks to victimize users andenterprises. Phishing, which is a form of social engineeringattacks, is often used by such criminals to fraudulently gainaccess to sensitive information or systems by impersonatinga trusted party [35]. In the past, phishing attacks have usedthe e-mail and web channels to reach their victims. How-ever, recently, there has been a tremendous growth of similarphishing attempts over the telephony channel. New formsof phishing attacks have emerged exploiting traditional textmessaging services, i.e., SMS (smishing [11]) and voice phish-ing (vishing [14]).

Several factors make the telephony channel attractive forcyber criminals. The convergence of telephony with the In-ternet has resulted in an unprecedented growth of new formsof online communication, especially mobile communicationdue to the advent of Over-The-Top (OTT) messaging appli-cations (like WhatsApp, Viber, and WeChat [6]). Becauseof the growing popularity of OTT messaging applications,particularly WhatsApp, malicious actors are now abusing itfor illicit activities like delivering spam and phishing mes-sages. Unsolicited messages like investment advertisements,adult conversation ads (random contacts requests) were seento propagate on the channel in early 2015 [7].

Vishing attacks are also increasing due to 1) low mobile-mobile calling plans with the advent of services like Skypeand Google Voice. This allows spammers to pump out hugevolumes of voice calls at a marginal cost, and 2) easy callerID spoofing due to Voice over IP phone technology that al-lows spammers to pick an area code and even the prefix

arX

iv:1

512.

0733

0v1

[cs

.SI]

23

Dec

201

5

number they want when they set up a new phone number.These numbers can be used to disguise where calls origi-nate. To avoid falling victim to such vishing attacks andknow more about the incoming phone number, cloud-basedcaller identification services are emerging to help in gettingadditional information about the caller. Millions of peopleare using such applications, namely Truecaller [12], Face-book’s Hello [4], and Whitepages Caller ID applications andBlock [17].

OTT messaging applications and caller ID applicationsuse a phone (mobile) number, a personally identifiable pieceof information with which an individual can be associateduniquely, in most cases [50]. OTT messaging applicationsuse it to uniquely identify users and allow them to find theirfriends who also use the same application and caller ID ap-plications provide additional information about the callingphone number. Prior research has shown that other Inter-net resources like e-mail addresses can be exploited as anidentifier to launch targeted phishing attacks [35, 47], dis-tributed phishing attacks [37], and to correlate user identi-ties across social networking platforms [21]. In this paperwe demonstrate how attackers can exploit phone numberas a unique identifier and use cross-application features forlaunching spear [47] and social phishing attacks [35].

A challenge spammers face when using an e-mail addressis that this medium is heavily defended and spammers of-ten cannot ensure that a spam message has been deliveredand seen by the target user. Furthermore, unlike e-mailaddresses that come from an unlimited pool and can befreely created, phone numbers are a limited and controlledresource. People generally retain the same number for along period due to the cost associated with it [10]. Also,phone numbers are a finite pool with a defined numberingplan. They can be easily enumerated by looping throughthe entire pool of number space. However, it is possiblethat not all phone numbers are currently allocated to usersand some of them may be unassigned. Thus, determining ifa phone number is currently assigned and its owner can bereached is a challenge that needs to be overcome before suchapplications can be targeted.

In this paper, we demonstrate how a phone number can beused across multiple applications to aggregate private andpersonal information about the owner of the phone number.Such information can then be used for targeted attacks. Forexample, reverse-lookup contact feature used by caller IDapplications like Truecaller [12] can be exploited to find moredetails (name) about the phone number’s owner. Moreover,by correlating this with the public information present ononline social networking platforms (e.g., Facebook), attack-ers can determine the social circle (friends) of the victim.Such information can then be used to launch a variety ofphishing attacks.

We focus on exploring how cross-application features canbe exploited to harvest information that can facilitate tar-geted and non-targeted attacks on different channels viz.,OTT messaging applications, voice, e-mail, or SMS. First,we demonstrate how to craft targeted spear and social phish-ing attacks against a random pool of phone numbers on OTTmessaging applications. OTT messaging applications allowattackers to find relevant information about targets by ex-ploiting address book syncing feature which helps to discoverfriends on a given OTT messaging application (e.g., What-sApp). Second, we demonstrate a novel targeted vishing at-

tack that can be carried out by compromising the integrityof caller ID applications; an attacker can create a convincingprofile to gain victim’s trust. Also, because the information(e.g. name) provided to these applications during registra-tion is not verified fully, it is relatively easy to add accountswith false information and impersonate trusted entities suchas banks. By making a call appear to come from such enti-ties, it is easy to deceive people into giving out their personalinformation like bank account number, credit card numberetc. The success rate of vishing attacks can be increased bymaking them more personalized and targeted by collectinginformation about the victim from Truecaller and Facebook.Third, we provide early evidence of the feasibility of craft-ing whaling attacks [15], attacks that are targeted againstthe owners of vanity numbers [9], phone numbers generallyowned by people with high influence or high-net-worth indi-viduals, who are very attractive targets for criminals.

By developing an automated and scalable system that al-lows such attacks to be crafted at scale and evaluating theeffectiveness of phishing attacks on OTT messaging applica-tions (as a proof of concept) with a roleplay user study, wemake the following contributions on three different fronts:

• Feasibility: This is the first attempt to systemati-cally understand the threat posed by the ease of cor-relating user information across caller ID lookup ap-plication (Truecaller), and social networking applica-tion (Facebook). This was executed using phone num-bers as unique identifiers. We show the attack is feasi-ble with easily available computational resources, andposes a significant security and privacy threat. An at-tacker can use these cross-application features to launchhighly targeted attacks on multiple channels like OTTmessaging applications, voice, e-mail, or SMS.

• Automation: We design and implement an auto-mated system that takes a phone number as an input,collects necessary information about the victim (ownerof the phone number). It can automatically determinethe target attack channel (OTT messaging applica-tions, voice, e-mail, or SMS), and finally crafts an at-tack vector (both targeted and non-targeted) to launchan attack against the victim on the chosen channel.

• Scalability: For 1,162,696 random pool of Indian phonenumbers that we enumerated, it is possible to launchsocial and spear phishing attacks against 51,409 and180,000 users respectively. Vishing attacks can be launchedagainst 722,696 users. We also found 91,487 highly in-fluential victims who can be attacked by crafting whal-ing attacks against them.

To demonstrate the effectiveness of phishing attacks onOTT messaging applications, we present results by conduct-ing a roleplay user study with 314 participants recruitedfrom Amazon Mechanical Turk (MTurk). Our results areconsistent with prior research on e-mail and online socialnetworks [21, 35] and confirm empirically that social phish-ing (69.2%) is the most successful attack on OTT messagingapplications, as compared to spear (54.3%) and non-targetedphishing (35.5%) attacks.

To the best of our knowledge, this is the first explorationof large-scale targeted attacks abusing phone numbers alongwith an evaluation of the effectiveness of the attacks. Given

that telephony medium is not as well defended as e-mail, webelieve that these contributions offer a promising new direc-tion and demonstrate the urgent need for better security forsuch applications.

2. RELATED WORKIn this section, we briefly outline some of the prior re-

search related to abusing phone numbers; vishing attacksand Spam over Internet Telephony (see Section 2.1) andabusing address book syncing feature of smartphones foruser profiling (see Section 2.2), We also discuss some of therelated work on launching targeted attacks on online socialmedia (see Section 2.3).

2.1 Vishing Attacks and SPITDue to low cost and scalability of VoIP based calling sys-

tems, scammers are using the telephony channel to makemillions of call and expand the vishing ecosystem. Priorwork has explored the detection and ways to combat scamson VoIP. Griffin et al. demonstrated that vishing attackscan be carried out using VoIP [30]. They illustrated howseveral vishing attacks can be crafted in order to increaseinformation security awareness. Chiappetta et al. ana-lyzed VoIP CDRs (Call Detail Records) to build featuresthat can classify normal or malicious users during voice com-munication [26]. The features were built using mutual in-teractions and communication patterns between the users.Past literature demonstrates detection of spam over VoIPthrough semi-supervised clustering [49], constructing multi-stage spam filter based on trust and reputation of callers [27],comparing human communication patterns with hidden Tur-ing tests to detect botnets [44], building a system using fea-tures like call duration, social networks, and global reputa-tion [19], proposing protection model based on user-profileframework such as users’ habits [45], placing telephone hon-eypots to collect intelligence about telephony attacks [31,33], and using call duration and traffic rate [39]. Caller IDspoofing is being used by scammers to hide their real identityand make fraudulent calls. Researchers have implementedvarious solutions to detect caller ID spoofing, using covertchannels built on timing estimation and call status for veri-fication [43], identifying the caller by tracing the calls to thecorresponding SIP-ISUP interworking gateway [48], usingcustomer’s phonebook feature for storing white and blacklists for filtering unwanted voice calls [24], and detecting au-dio codecs in call path, calculating packet loss and noiseprofiles to determine source and path of the call [20].

In this work, we present first evidence of feasibility of tar-geted vishing attacks by exploiting the integrity of the infor-mation provided by caller ID applications e.g., Truecaller.

2.2 Abusing Address Book Syncing in OTT Mes-saging Applications

Recent work shows that collection of user profiles canbe automated and yields a lot of personal information likephone numbers, display names, and profile pictures [25, 38].Schrittwieser et al. analyzed popular OTT messaging ap-plications like WhatsApp, Viber, Tango etc. and evalu-ated their security models with a focus on authenticationmechanisms [46]. They also highlighted the enumerationand privacy-related attacks that are possible due to addressbook syncing feature of these applications. Antonatos et al.proposed HoneyBuddy, an active honeypot infrastructure

designed to detect malicious activities in Instant Messagingapplications like MSN [18]. It automatically finds peopleusing a particular messaging service and adds them to itscontacts. These findings confirmed the ineffectiveness of ex-isting security measures in Instant Messaging services.

In our research, we further demonstrate how cross-applicationfeatures can be abused to launch targeted attacks; addressbook sync feature in OTT messaging applications and ex-ploiting integrity of information in caller ID applications.

2.3 Targeted Attacks and User ProfilingBilge et al. launched automated identity theft attacks

via profiling users on SNS (Social Networking Services) byemploying friend relationship with the victims [22]. The au-thors showed that people tend to accept friend requests fromstrangers on social networks. Balduzzi et al. presented ex-periments conducted on“social phishing” [21]. They crawledsocial networking sites to obtain publicly available informa-tion about users and manually crafted phishing e-mails con-taining certain information about them. This study showedthat victims are more likely to fall for phishing attempts ifsome information about their friends or about themselvesis included in the phishing e-mail. Jagatic et al. showedthat Internet users might be over four times more likelyto become victims if the sender is an acquaintance [35].Gupta et al. showed that inference attacks can be em-ployed to harvest real interests of people and subsequentlybreak mechanisms that use such personal information foruser authentication [32]. Huber et al. presented friend-in-the-middle-attack on Facebook which could leverage socialinformation about users in an automated fashion [34]. Theyfurther pointed out the possibility of context-aware spamand social phishing attacks, where attacks were found to becheap in terms of cost and hardware. Boshmaf et al. high-lighted vulnerabilities that can be exploited by social botsto infiltrate OSNs [23]. They showed that social bots canmimic real users and exploit friendship network leading tostrong privacy implications. Kurowski showed a manual at-tack on WhatsApp to retrieve personal information aboutvictims and proposed the feasibility of social phishing at-tacks against victims [42].

In this paper, we demonstrate how phone numbers can beused for automated cross-application targeted attacks wherepeople are attacked on one application by leveraging infor-mation from multiple other applications i.e. using True-caller and Facebook to launch targeted phishing, vishing,and whaling attacks on OTT messaging applications, voice,e-mail, or SMS.

3. SYSTEM OVERVIEW: FEASIBILITY ANDAUTOMATION

In this section, we demonstrate the feasibility and theease with which different targeted attacks can be craftedby abusing phone numbers. To automate the whole process,we build a system that exploits cross-application features tocollect information about a phone’s user and determines theattack channel (OTT messaging applications, voice, e-mail,or SMS) and targeted or non-targeted attack vectors (seeFigure 1). Specifically, the system has four main steps. a)Based on a numbering plan, phone numbers are randomlygenerated and inserted into an address book of a smart-phone. This address book is on a device that is under the

control of the attacker. b) The system fetches data fromTruecaller and Facebook applications to determine any ad-ditional information about the owners of those phone num-bers. c) After the information is aggregated, the systemdetermines the attack channel viz. OTT messaging applica-tions, voice, e-mail, or SMS, and d) finally crafts an attackvector and targets the victim with the best possible attack(based on the information collected).

Figure 1: System for Cross-Application InformationGathering and Attack Architecture.

3.1 Step 1: Setting Up Attack DeviceThis section elaborates phone number generation and set-

ting up the device under attacker’s control, once phone num-bers are generated. The system generates a large pool ofphone numbers which could be exploited by an attackerto launch targeted attacks. There are several methods toobtain a pool of phone numbers; consolidating white-pagesdirectory or any other public online directories, or scrap-ing the Internet using regex patterns. We chose the easiestmethod for an attacker; taking random phone numbers asinitial seeds, incrementing the digits by one to obtain a suf-ficient pool. Unlike e-mail addresses, the phone number setis finite, therefore, an entire range can be enumerated andinserted into the address book. This may give a few missesas some phone numbers may not be allocated for generaluse. Once phone numbers are generated, attacker initializesthe address book of the device under his control. The phonenumbers added in the address book are now his potentialvictims for carrying out various kinds of targeted attacks asdemonstrated in this paper.

3.2 Step 2: Collecting Information for an At-tack Vector

In this step, the system aggregates all the available infor-mation to launch an attack against the victim. To obtaininformation about the victim, we used Truecaller, an appli-cation that enables searching contact information using aphone number [12]. Its legitimate use is to identify incom-ing callers and block unwanted calls. It is a global collab-orative phone directory that keeps data of more than onebillion people around the globe. We used Truecaller as anexample, but any such application can be used to determinethis information. Truecaller also maintains data from socialnetworking sites and correlates this information to create alarge dataset for people who register on it. Also, due to itsaddress book syncing feature, it retrieves information about

contacts (friends) of the “owner of the phone number” whoinstalled it too. The ‘search’ endpoint of Truecaller applica-tion provides details of an individual like:

name, address, phone number, country, TwitterID, e-mail, Facebook ID, Twitter photo URL, andphoto URL.

However, the private information obtained is according tothe privacy settings of users.

We automated the whole process of fetching informationabout phone numbers from Truecaller. We exploited thesearch end-point (used to search information about a ran-dom phone number) to obtain the registration ID corre-sponding to a particular phone number 1. This was nec-essary to make authenticated requests and retrieve the in-formation from their servers. We extracted the registra-tion ID from the network packet sent while searching a ran-dom phone number on Truecaller application installed onour iPhone as shown in Figure 2. Once the registrationID was obtained, we programmatically fetched informationfor phone numbers in our dataset. Multiple instances ofthe process were initiated, on a 2.5 GHz Intel i5 processor,4GB RAM at the rate of 3000 requests / min. We workedwith only one registration ID for not abusing the Truecallerservers and effecting its services, however, it is easy for anattacker to scale the process by collecting multiple registra-tion IDs to bypass rate limits imposed by Truecaller.

Figure 2: Screenshot of a network packet which isused to obtain the registration ID from Truecallerto fetch information from its servers.

To obtain the social network of the victim, we used Face-book, the largest social network of family and friends [1]. Weassume that friends obtained will be related to the personin some way or the other which can increase the probabil-ity of success of a social phishing attack. However, we donot differentiate between the affinity of a friend Alice withthe victim as compared to another friend Charlie. Though,there may be greater affinity with one friend as comparedto the other in the real-world, however, in this paper, wetreat all friends equal and leave affinity determination as fu-ture work. Truecaller aggregates data from various socialnetworking websites and sometimes provides a link to thepublic profile picture of the victim on Facebook. We ex-tracted Facebook ID from these links to retrieve friends ofthe victim on Facebook.

Extracting friends from victim’s profile is a non-trivialtask, since everyone does not have their friendlist set aspublic. Therefore, we decided to use public sources like vic-tim’s public feed, victim’s public photo albums, and victim’spublic posts on Facebook to obtain friends information [3],assuming users liking / commenting on any of these pub-lic sources are friends of the victim. To validate the above

1We used this phone number only for research purposes andnothing else.

hypothesis, we performed a small experiment to determineif friends obtained from public sources on Facebook are asubset of public friendlist. Even though normal access to-ken from Facebook does not provide these details, we wereable to fetch the information using a never-expiring mo-bile OAuth token obtained from iPhone’s Facebook applica-tion 2. We monitored the data packet sent while launchingFacebook application on our iPhone device and extractedthe authentication token to make further requests.

We collected a random sample of 122,696 Facebook IDsand obtained 95,756 friends from public sources and 80,979friends from public friendlist (see Figure 3). There were only62,574 users for whom we were able to find friends from bothpublic sources and public friendlist. Out of which, we foundthat 42,552 (68%) user-IDs liking and commenting on pub-lic sources were part of victim’s friendlist with more than95% matching rate. As observed in Figure 3, in some casesfriends from public sources were not a complete subset offriends from public friendlist. We obtained 5,881 friendswith 90 - 95% matching, 3,754 friends with 85 - 90% match-ing, and 10,387 friends with less than 85% matching. Thiscould be because some users might have disabled all plat-form applications from accessing their data. In this case,they might not appear anywhere in any Facebook API [5].To launch attacks using friends information, friends can bepicked from public friendlist, if available, else, the attackercan rely on public sources to extract friends. Therefore, weextracted the Facebook ID from the photo URL (using reg-ular expression) obtained from Truecaller JSON response,and obtained public sources using Facebook Graph API tofind friends on Facebook to craft social phishing attack vec-tor. For example, following JSON object was obtained forone of the phone numbers in the dataset –

{"NAME": "XXXXX","NUMBER ": "+91 XX0000000X","COUNTRY ": "India","PHOTO_URL ": "http :// graph.facebook.com/

XXXXXX/picture?width =320& height =320","e-mail": " "}.

The Facebook ID was parsed from PHOTO_URL and used tomake further requests. E-mail addresses for some users werealso available which can be used to target them.

An attacker can utilize the information, as obtained in thisstep and craft attack vectors described in step 4. The at-tacker can craft non-targeted attacks, in case no informationabout the victim is obtained.

Apart from applications like Truecaller and Facebook thatwe explored in this paper, attackers can exploit CNAM (CallerID Name) database 3, a database that is linked to namesof calling number. The service operational in US, providesinformation associated with a landline number. Attackerscan use this to obtain basic information about their targets.This is out of scope of current work.

3.3 Step 3: Formulating Attacker’s Profile andAttack Channel

2We are not sure if this is an additional feature provided byFacebook or a bug in their system. At the time of writingthis paper, we did not find any official Facebook documen-tation about it.3http://www.voip-info.org/wiki/view/CNAM

Figure 3: Relation between friends obtained frompublic sources and public friendlist. Friends frompublic sources are found to be a subset of friendsfrom public friendlist in 68% cases (with more than95% matching).

Once the data is collected about phone number’s owner,the system determines the channel (OTT messaging appli-cations, voice, e-mail, or SMS) to phish the victim. Thisentirely depends on whether the victim is present on thatparticular channel.

OTT messaging applications .If the attacker decides to choose OTT messaging applica-

tions like WhatsApp, Viber, or Snapchat, he needs to en-sure if the victim is using one of these applications. This isachieved by exploiting the address book syncing feature inOTT messaging applications. Once a user registers himselfon these applications, his contacts in the address book areuploaded (automatically, for some applications) to the OTTmessaging applications’ service provider and are matchedagainst the users of the application to find already exist-ing contacts. Only the information about the owners ofthe phone numbers present in the address book is retrieved.Unlike Facebook and Twitter, these applications make nosuggestions / recommendations for people who might be us-ing these OTT messaging applications. While this makes iteasy and convenient for users to discover friends on theseapplications rather than adding them manually, it poses asecurity threat as well as, an attacker can use this to findthe presence or absence of the victim on these applications(i.e., the attack channel). The attacker can himself create aconvincing profile (profile picture and local phone number)on WhatsApp to make sure the victim feels that attacker islegitimate.

Voice.In addition, an attacker can choose voice as an attack

channel to target their victims. Similar to OTT messagingapplications, to target victims on this channel, an attackerneeds to gather relevant information without checking thepresence of the victim beforehand. An attacker can devicea strategy to make himself look like a trusted or legitimateorganization, to gain victim’s trust. Specifically, attackerscan exploit the integrity of information provided by callerID applications like Truecaller [12], Facebook’s Hello [4],and Whitepages Caller ID and Block [17]. These applica-tions are emerging to help in getting additional informationabout the incoming caller. In general, these applications al-

low an individual to register using his / her phone numberand help in identifying the caller by showing the informa-tion (like name) from their respective databases. Caller IDapplications also gather information from social networkingsites to collect more information about the caller. Attack-ers can undermine such an application to gain trust of theirvictims. Specifically, a) attackers can register a phone num-ber (controlled by him / her) as a trusted bank / company/ organization in which a user is interested in or is dealingwith; b) spoof one of the already registered phone numberswith the caller ID applications and call victims such that thecall appears to come from a real entity. We describe each ofthem in detail.

• Fake Registration: An attacker can add spuriousinformation in caller ID applications fairly easily, thuscompromising the integrity of the information providedby them to gain victim’s trust. Associating an iden-tity with a phone number increases the trust of anindividual and likelihood to pick a call. Since callerID applications do not have a mechanism for verifica-tion of the users’ details, and rely on the informationprovided by the user while registering, it is easy for anattacker to abuse this trust. For example, an attackercan register as multiple fake banks on caller ID appli-cations as shown in Figure 4. We only use HDFC Bankas an example, but any name / entity can be used.

Figure 4: Fake registration on Truecaller as HDFC.

For registration, he needs a smartphone device withworking phone connection. It is a manual processwhere a short SMS code is sent from caller ID ap-plications to verify the phone number. Since the num-ber of banks are limited, it is not difficult for the at-tacker to do this manually. Attacker does not knowthe bank of all these victims, therefore, the attackercan target a large user population to achieve a goodsuccess rate. As millions of automated VoIP calls canbe generated at low cost, as shown in the past, at-tacker does not need to worry about picking only thevictims whose banks are known [30]. To make the pro-file look more authentic, fake social media profiles canbe created and linked to caller ID applications whileregistering an account on it. Top five most popular ap-plications (Truecaller, Whitepages caller ID and Block,

Facebook’s Hello, Whoscall-Caller ID and Block, andContactive) with fake registrations as HDFC bank byexploiting caller ID applications has been shown in Ap-pendix.

• Caller ID spoofing: Another trick to deceive vic-tims uses caller ID spoofing which can be carried outby imitating already registered phone numbers or otherphone numbers whose details were uploaded by callerID applications exploiting address book syncing fea-ture. As a user must have entered some details abouthim / her while registering, it makes the attacker lookgenuine, than an unknown phone number flashing onthe screen.

SMS.An attacker can choose SMS as the attack channel to phish

their victims. He needs to gather relevant information aboutthe victim without checking his / her presence on the attackchannel. As the victim does not see any profile of the at-tacker, he can choose using a local phone number to gainvictim’s confidence.

E-mail.Since so many people around the world depend on e-mail,

it is a lucrative channel for attacks. Attackers lure people ingiving away their information or entice them to take someaction. If there is a non-empty field in the JSON objectreceived from the Truecaller, e-mail can be used to phishusers. Attackers can craft these e-mails to look convincing,sending them out to literally millions of people around theworld [8]. As attacker’s profile is visible to the victim, hecan carefully choose an e-mail address and name to convincevictim about his authenticity.

3.4 Step 4: Crafting Attack VectorAfter the attack channel is determined, attacker can craft

appropriate attack vector to phish the victim. We describethe attack vector generation details for each of the attacksbelow.

3.4.1 Social Phishing AttacksAlthough phishing is a social engineering attack, here we

discuss social phishing [35], i.e., how phishing attacks can bebetter targeted by making them appear to be coming froma friend within victim’s own network. Friends’ informationcan be conveniently chosen to gain trust, therefore, the at-tacker uses victim’s name and one of his friend’s information(i.e., friend’s name) to craft the attack vector. This infor-mation is obtained from Facebook, as discussed earlier inthis section.

3.4.2 Whaling attacksWhaling attacks [15] that are directed specifically at senior

executives or other high-profile individuals within a busi-ness, government, or other organization. It uses the sametechnique as above mentioned targeted phishing, vishing, orwhaling attacks but the intended victims are people withhigh influence or high-net-worth individuals. In India, thereis a particular set of phone numbers reserved by mobile op-erators for politicians, bureaucrats, and people willing toinvest large amount of money to get these special phonenumbers. They are called Vanity / VIP / Fancy numbers

and follow a specific pattern [13]. It could be one digit re-peated several times, 99999-xxxxx or xx-8888-xxxx; twodigits, xx-85-85-85-xx; or in different orders, xx-123-123-xx or xx-11-112233. The main advantage of vanity phonenumbers over standard phone numbers is increased memora-bility. Since they are bought at higher price, owners of thesephone numbers can be assumed as people with high influ-ence who likely are more attractive targets for attackers [9].For very special numbers, network providers host auctionsonline where people can purchase these numbers [2]. Us-ing only vanity numbers in the address book, attackers canlaunch whaling attacks that only targets HNIs (High-net-worth individuals) by sending them targeted or non-targetedphishing messages or initiating vishing calls.

3.4.3 Spear Phishing AttacksSpear phishing attacks are directed at specific individuals

or companies. These attacks are crafted using some a-prioriknowledge of either victim’s name, location, or interests tomake it more believable and increase the likelihood of itssuccess. We focus on generating spear phishing attack vec-tors using victim’s name, as obtained from Truecaller.

3.4.4 Non-targeted AttacksNon-targeted attacks are undirected attacks which are

aimed to target as many users as possible. The goal is toreach out to a large audience and not to target a particu-lar individual. Since it only requires the knowledge whetherthe victim is present on the channel, this can be achievedby crafting a non-targeted phishing, vishing, or whaling at-tack, even if no information about the victim is available.With low VoIP calls, non-targeted vishing attacks are cost-effective for attackers.

All the targeted and non-targeted attacks described in thisstep can be launched on either channels viz., OTT messagingapplications, voice, e-mail, or SMS.

4. SCALABILITYTo define scalability, we assume that an attacker starts

with no information about its potential targets. The attackmethod’s scalability can be characterized by the fraction ofpeople who can be reached over an attack channel, and tar-geted attacks can be launched against them. To demonstratethe scalability of our attacks, we enumerated through a listof 1,162,696 random Indian phone numbers. Since thesenumbers are chosen randomly, no additional information isavailable about their owners at the beginniing. We demon-strate the scale at which each of the proposed attacks canbe carried out with the techniques described earlier.

4.1 Phishing AttacksWe forged the address book of an Android device by in-

serting all these phone numbers in multiple phases. The nextstep was to collect attributes associated with the owner ofthe phone number (victim). Truecaller (TC) was used tocollect more information about the victims. Detailed infor-mation for 722,696 (62%) users was collected using True-caller; name was obtained for all the users as shown inFigure 5. For rest of the users whose information cannotbe obtained from Truecaller, non-targeted phishing attackscan be launched against them. To craft more targeted andpersonalized attacks, i.e., social phishing attacks, friends in-formation was leveraged from Facebook (FB). Social circle

information was obtained for 114,161 (93%) out of 122,696users; 80,979 from public friendlist and 33,182 friends frompublic sources. To check the presence of these numbers onan attack channel, they were synced with WhatsApp ap-plication (WA) using address book syncing feature. About51,409 users were present on WA. Social phishing attackscan be launched against these users whereas spear phishingattacks can be launched against other 180,000 users whosesocial circle was not obtained, but were present on WA.Numbers which were not found on WhatsApp either maynot be allocated to any user or may not be registered on it.

Spear phishing attacks can be launched against 600,000users on voice or SMS. In addition, 122,696 users can besocial phished on voice or SMS. E-mail address for 81,389users were obtained from Truecaller; 13,754 can be socialphished and 67,635 can be spear phished on voice or SMS.

Figure 5: Data collection to demonstrate scalabil-ity of phishing attacks of the system choosing OTTmessaging applications as the attack channel, WA–WhatsApp, TC–Truecaller, FB–Facebook.

4.2 Vishing AttacksPersonal information for 722,696 users was found on True-

caller against 1,162,696 phone numbers searched. Vishingattacks can be crafted against the owners of these phonenumbers. We could extract Facebook ID’s for 122,696 users.Using Facebook Graph API, we obtained following detailsfor these users: gender (112,880), relationship status (57,755),work details (92,352), school information (110,426), employerdetails (106,746), birthday (9,728), and hometown (80,979).The collated information can be used to increase the successrate of targeted vishing attacks.

4.3 Whaling AttacksAs owners of vanity numbers might belong to elite mem-

bers of the society, they can be of particular interest to at-tackers. We looped through the “patterns” available from ane-auction website to enumerate vanity numbers pool [2]. Weinitialized our smartphone’s address book with 171,323 van-ity numbers. We found 91,487 vanity numbers on Truecallerand 11,286 on Facebook. They were synced with WhatsAppand 5,756 (51%) were found on it. Out of 11,286 vanitynumbers that were found on Truecaller and Facebook; weobtained personal information (using Facebook) about own-ers as follows: gender (10,246), relationship status (3,733),birthday (726), work details (6,729), school details (10,994),employer details (9,801), and hometown (6,952). We manu-

ally analyzed Facebook profiles of 100 random vanity num-ber owners to find their occupation details and found di-rector / CEO / chairman (10), student (10), engineer (12),consultants (2), business (5), accountant / officer (8), lec-turer (5), manager (8), bank officials (12) for 70 user pro-files.

Whaling attacks with social information was obtained for11,286 users which can be attacked on voice or SMS. How-ever, only name was obtained for 80,201 users, using True-caller, who can be made targets on voice or SMS. E-mailaddress for 11,013 users was obtained; 1,354 users with so-cial information. Attacks can be made more targeted andpersonalized against these users.

5. ETHICAL AND LEGAL CONSIDERATIONSCrawling data is an ethically sensitive area. We did the

data collection just to demonstrate the feasibility and scal-ability of targeted attacks by abusing a phone number. Thegoal of this work was not to collect personal informationabout individuals, but to explore how such applications canbe abused to collect personal information. As a proof ofconcept, we collected information about owners of randomphone numbers ensuring that the collected information isnot made available to any other organization or individual.All the conducted experiments were approved by the Insti-tutional Review Board (IRB) of our institution. Data col-lected from the participants was anonymized and protectedaccording to the procedures described in the correspondingIRB submission documents. At the end of our experimentsand analysis, phone numbers of all the profiles were delinkedto maintain privacy and confidentiality of data. Phone num-bers and Facebook IDs were hashed to preserve privacy. Wecollected only the public information available on Facebookusing it’s Graph API.

6. STUDY DESIGNIn this paper, we only focus on demonstrating the effec-

tiveness of phishing attacks on OTT messaging applications.Thus, we do not address effectiveness of the other attacksand channels as demonstrated in previous sections.

To demonstrate the effectiveness of phishing attacks onOTT messaging applications, we chose to conduct an onlineroleplay user study on Amazon Mechanical Turk (MTurk).We chose this over a real-world study because conductinga real-world study would involve user deception. To launchattacks as proposed in this paper would require us to gatherpersonally identifiable information (phone number) that wouldnot be possible without deception. Deception studies chal-lenge the principle of respect for person, and are generallynot preferred in the research community due to unforesee-able consequences [28, 36]. Debriefing, unless carried outface-to-face between researcher and the subject, remains achallenge as it is almost impossible to replicate it in onlinephishing studies. A study that launched actual phishing at-tacks without informing their users reported that users gotinfuriated with the attacks after they were debriefed [35].Owing to these constraints and limitations, we chose to val-idate the effectiveness of our attacks using a roleplay userstudy. The benefit of the roleplay is that it enables re-searchers to study effect of phishing attacks without con-ducting an actual phishing attack. It has been shown thatsuch roleplay tasks have good internal and external valid-

ity [29, 41, 47]. The study was built using Javascript frame-work and not based on standard self-reporting data collec-tion techniques like survey and questionnaires.

Participation was restricted to those who were above 18years of age and had been using WhatsApp on regular ba-sis. There was no restriction on the participants belong-ing to a particular region or country. All participants whocompleted the experiment were paid $0.30. Our experimentconsists of two phases, a) Briefing: to ensure that partici-pants have concrete information and clear role description.b) The Play: to assess participants susceptibility to phish-ing attacks. Participants were exposed to one of the threephishing attack vectors: non-targeted (e1), spear (e2), andsocial (e3).

6.1 BriefingSusceptibility to phishing attacks was measured with re-

sponse to a roleplay task which was built on Javascript andshowed multiple screens as WhatsApp screenshots. Thisphase of the study was common across all participants tofamiliarize them with the real-world scenario. A hypotheti-cal situation for the participant to assume himself as“Dave”,and has friends “Alice”, “Bob” and “Charlie”. We used Face-book as a medium to bootstrap this and to let the partici-pant familiarize with the roles. An attacker can extract thisinformation from Facebook and create a social phishing at-tack vector to phish the victim, which is modeled in this step.Next screen shows the registration on WhatsApp, as“Dave”,using a random phone number. The aim is to make the par-ticipant understand that the user (i.e., Dave) has an accounton WhatsApp. Once registered, WhatsApp syncs Dave’saddress book and found only Charlie. Note that other Face-book friends of Dave (i.e., Alice and Bob) were not presenton WhatsApp. There could be two reasons, either Dave doesnot have Alice’s and Bob’s number in his smartphone or Al-ice and Bob do not have an account on WhatsApp. Theprimary idea is to introduce Charlie as the only friend whois present on WhatsApp and other friends (Alice and Bob)are not present on WhatsApp. This is to mimic the attackmodel when an attacker might not know which of victim’sfriend is present on WhatsApp. Therefore, the success of theattack should be independent of this knowledge; who pro-vided wrong answers to any of the following two questionsasked, “Who was your friend on Facebook?”, and “Who wasyour friend on WhatsApp?”. Since correct answers to thesequestions were provided during course of the experiment,participants who did not provide correct answers were fil-tered out.

6.2 The PlayIn this phase of the user study, participants were exposed

to one of the three cases of phishing: non-targeted, spear,and social. Each of the case was randomly assigned to theparticipants. In all three cases, the legitimate case wasshown to the participant to ensure that his responses wereas expected as in a real-world scenario i.e., given a messagem and a trust function T ,

T (m) from known no. ≥ T (m) from an unknown no. (1)

The order of phishing and legitimate messages was ran-domized to avoid learning bias during the course of the ex-periment. Some participants were removed from the analysisdue to unexpected behavior as discussed in Section 6.3. At

the end of each WhatsApp message shown to the partici-pant, the participant was asked the following question andcorresponding options: “What would you like to do with themessage?” with following options: click, reply, delete, or donothing. Now we describe the three experiments to test thesuccess of phishing attacks. Since the names and messagecontent were kept same in all the three experiments, we donot foresee any bias.

e1: Testing non-targeted phishing attack’s success.Non-targeted phishing is defined as an attack scenario

where no additional information about the victim is knownbeforehand, except the phone number. In the play phase,participants were exposed to two scenarios in a random or-der to avoid learning bias; probably phishing message (G#1)and legitimate message (#1) . In G#1, the sender is a randomphone number, whereas, in #1, the message is from Dave’sfriend Charlie. The former scenario is probable phishing be-cause from Dave’s perspective, the sender could be one ofhis friends who is not present in his WhatsApp contacts.However, the latter case is legitimate because Charlie wasalready in Dave’s address book, as mentioned during thebriefing phase (see Section 6.1).

e2: Testing spear phishing attack’s success.Spear phishing is defined as an attack where some infor-

mation about the victim is known beforehand, in additionto his phone number. In our experiment, name of the par-ticipant was “Dave”, as described in the briefing phase. Thisinformation was used to craft a spear phishing attack vector.Similar to non-targeted phishing, participants were exposedto two scenarios; probably phishing message (G#2) and legit-imate message (#2). In G#2, the sender is a random phonenumber, whereas in #2, the message appears to be comingfrom the friend Charlie. However, with one notable differ-ence, that the name of the victim (Dave) was added to themessage to make it more personalized as compared to e1.

e3: Testing social phishing attack’s success.Social phishing is defined as an attack where social infor-

mation (friends, acquaintances, colleagues, etc.) associatedwith the victim is gathered, in addition to known basic infor-mation about the victim (name and phone number). In thispart of the experiment, participants were exposed to threescenarios (as compared to two in e1 and e2) in a randomorder; probably phishing message (G#3), legitimate message(#3), and phishing message ( 3). In G#3, the sender is arandom phone number, however, mentioning the name of“Alice” (one of Dave’s friend on Facebook but not in theWhatsApp contacts, see Section 6.1). From Dave’s perspec-tive, this could probably be a legitimate message becauseAlice is not in Dave’s address book and plausibly in real-world scenario Alice is trying to initiate a conversation withDave. On the other hand, this could be a phishing mes-sage, because friend’s name (Alice) could be forged and anattacker could imitate Alice and send a message to Dave. In#3, the message appears to be coming from the friend Char-lie and in 3, the message is coming from a random phonenumber having Charlie as the friend’s name. Since, Charlieis anyways a friend of Dave on WhatsApp, this is definitelya phishing attack because the sender phone number shouldhave shown Charlie and not a random number.

6.3 Effectiveness of AttacksIn this section, we demonstrate the effectiveness of phish-

ing attacks performed during our user study. In total, 460participants completed the entire user study, out of which129 participants were filtered out based on answers to twoquestions asked from the participants during the briefingphase (see Section 6.1). We present the results based on re-maining 331 participants. We used Kruskal Wallis and MannWhitney statistical tests to check the behavior of populationsubject to different order of messages shown to them. Wedid not find statistical difference between the random groupsin falling victims to all three kinds of phishing attacks.

Table 1 summarizes the results obtained in three exper-iment scenarios e1, e2 and e3 as mentioned in Section 6.2.In this work, we define the success of a phishing attack ifeither a participant decided to click the link or reply to themessage, whereas, attack is unsuccessful if the participantdecided to delete or do nothing about the message. Notethat these are potential victims who may fall for phishing at-tacks. Actual phishing attack happens when the participantgoes through all the steps in the phishing attack i.e. by pro-viding his / her personal details like credit card information,passwords etc. on the phishing web page. However, previ-ous studies have established that a very high percentage ofparticipants who click on the link continue to provide infor-mation to the phishing websites [40, 41, 47]. We believe thatusers who choose to reply to the message are potential vic-tims too, as the attacker can verify active usage of the phonenumber. Also, attacker can lure the victim to give out per-sonal information in subsequent messages. Extra cautioususers would have preferred to either delete / do nothing withthe message received. We denote,

fG#,

f#,

f =⇒ clicked / replied to the message, and

G#,#, =⇒ deleted / did nothing about the message

For example,fG# means participant chose to click or reply

to a probably phishing message, while G# means participantchose to delete or do nothing about the probably phishingmessage. We remove those participants from our furtheranalysis who chose to click on phishing / vulnerable messagebut not on the legitimate message. Because according toequation 1, T (#) ≥ T ( ) and T (#) ≥ T (G#). We denotethese participants as Unknown (see Table 1).

Table 1: Possible outcomes of phishing attacks foreach experiment. (G# → probably phishing message,

# → legitimate message, → phishing message,f#

→ click / reply to a legitimate message, # → delete/ do nothing to a legitimate message).

Case Vulnerable Cautious Unknown

e1

fG#

f# (37) G#

f# (24)

fG# # (7)

G# # (46)

e2

fG#

f# (56) G#

f# (19)

fG# # (4)

G# # (28)

e3

fG#

f#

f (54) G#

f# (12)

fG# # (2)

fG#

f# (9) G# # (20) G# #

f (1)

G#f#

f (9)

fG# #

f (3)

We denote those participants who chose to click / reply

(a) G#1 (b) #1 (c) G#2 (d) G#3

Figure 6: Three phishing attack scenarios shown to the participants (here, we only show the trimmed versionof the actual screenshots). The top left-hand corner shows the sender’s phone number (or name in case thesender’s phone number is in WhatsApp contacts).

on either phishing ( ) or probably phishing messages (G#)or both as Vulnerable (i.e. falling for phishing attacks). Allother participants were part of Cautious group, i.e., whochose to delete or do nothing about both phishing ( ) andprobably phishing messages (G#).

We define the success rate of phishing attack as:

Success(%) =V ulnerable

V ulnerable + Cautious∗ 100

In total, we have 314 out of 331 participants who wereeither vulnerable or cautious. We rest our analysis on these314 participants. We found that phishing attacks on OTTmessaging applications were successful as, e1 = 34.5% (37out 107), e2 = 54.3% (56 out of 103), and e3 = 69.2% (72out of 104). This is consistent with prior work that socialphishing is the most effective out of the three. Furthermore,in social phishing as observed from Table 1, equal numberof participants (63 = 54+9) fell for phishing when the namementioned in the message text was Charlie (i.e., it is com-ing from a friend who is in Dave’s WhatsApp contacts) andwhen the name was Alice (i.e., it is coming from a friend whois not in Dave’s WhatsApp contacts). This shows that in-cluding friend’s name in the message (irrespective of whetherthe friend is present or absent on WhatsApp) increases thesuccess rate of phishing attacks. We repeated the analy-sis for random 25%, 50%, and 75% of the total participantpopulation and found the success results to be consistent.Hence, we can establish that the participant pool size is suf-ficient for our analysis.

6.4 LimitationsThere are a few limitations to the current study. First, the

sample was drawn from MTurk users and is not expected tobe a complete representative of people using OTT messag-ing applications. Our sample of MTurk users tend to beyounger, more educated, and more tech-savvy than the gen-eral public. A second limitation of this study is the lack ofdirect consequences for user behavior. Participants might bemore willing to engage in risky behavior in this roleplay ifthey feel immune to any negative outcomes that may follow.However, the factors used to determine phishing susceptibil-ity would not differ as observed in the real-world behavior.

7. MITIGATING RISKSBased on the exploits we found as described in Section 3.4,

we present recommendations on how to alleviate (if not elim-

inate) the security risks created by these exploits. We com-municated with a number of service providers highlightingthe security issues; they considered it to be a serious problemand ensured prompt redressal. To combat the abuse, therehave been services in place, for instance, CNAM lookupdatabases 4 for landlines numbers. Also, initiatives like Se-cure Telephone Identity Revisited (STIR) working group 5

aim to provide a more secure telephony identity by limitingthe ability to spoof phone Caller ID. However, this relied onPKI that is currently not implemented. Recently, some ser-vices have put some defensive measures in place. WhatsAppincorporated spam blocker feature as a first step in this di-rection [16], though their effectiveness need to be studied.Facebook patched the mobile OAuth token which enabled usto obtain personal public information, which was otherwiseunavailable via Graph API.

7.1 OTT Messaging ApplicationsGiven the plausibility of phishing attacks on OTT messag-

ing applications, this medium needs to be better defended.OTT messaging applications can put certain checks; restrictaddress book sync features, where people can be added onlybased on requests (like Facebook). End-to-end encryptionposes a major challenge to identify a phishing message atzero hour. In order to effectively defend against phish-ing message, one solution could be assigning crowd-sourcedscore (phishing) to the source of a phishing message. OTTmessaging applications can filter messages with high phish-ing score. However, introducing noise in the dataset, re-mains a challenge. To this end, we are currently investi-gating a defense from our side. It involves building crossplatform intelligence from existing Internet infrastructureto associate history with a given phone number which canhep in modeling potentially bad phone numbers.

7.2 Caller ID ApplicationsThere is also a necessity to ensure the integrity of the

information provided by caller ID applications, as peoplerely heavily on them to know about the incoming call andtrust the information provided by these services.

Verification.One of the biggest challenge that caller ID applications

have to face is to implement verification of the information

4http://www.voip-info.org/wiki/view/CNAM5https://datatracker.ietf.org/wg/stir/charter/

provided by users. Currently, at the time of registration,only phone numbers are verified, and neither the entity be-hind these phone numbers nor the details of owners of thesephone numbers is verified. Caller ID applications can checkthe integrity of specific business organizations with appro-priate authority, listing them as verified users, and routinelyscan for any malicious activity in these accounts. Similartechniques can be implemented by caller ID applicationsto maintain the integrity of the information stored in theirdatabases.

Additional Information About Callers.Additional information can be provided about the caller /

the owner of the phone number. For instance, applicationscan record the timestamp when the account was registeredand call frequency patterns. These details can be providedto the user so that he / she can make an informed deci-sion about the caller. In addition, social information aboutthe caller can be displayed, like number of mutual friends,presence on social networks etc. Caller ID applications candesign several metrics like social rank based on the informa-tion aggregated across social networks. If the same nameappears across multiple networks and the user is found tobe active, he / she can be assigned a higher score than apassive user.

Delinking information.Caller ID applications like Truecaller serve as a reservoir of

information, by collating information from multiple sources.The data from different sources should not be aggregatedand stored at the same place, it serves as a goldmine reser-voir. Some parts of the information can be even encryptedto ensure unnecessary information leakage.

8. CONCLUSIONOTT messaging and Voice over IP applications are gain-

ing popularity worldwide. These applications have millionsof registered users. As much as these applications attractusers, spammers find them attractive as well. In this paper,we demonstrated the feasibility, automation, scalability oftargeted phishing, vishing, and whaling attacks by abusingphone numbers. We investigated how easy it would be fora potential attacker to launch automated targeted and non-targeted attacks on different channels viz., OTT messagingapplications, voice, e-mail, or SMS. We presented a novel,scalable system which takes a phone number as an input,leverages information from Truecaller (to obtain vicitm’sdetails) and Facebook (to obtain social circle), checks forthe presence of phone number’s owner on the attack chan-nel (OTT messaging applications, voice, e-mail, or SMS),and finally targets the victim. We collected informationfor 1,162,696 Indian phone numbers and show how non-targeted and targeted phishing, vishing, and whaling attackscan be crafted against the owners of these phone numbersby exploiting cross-application features. Social and spearphishing attacks can be launched against 51,409 and 180,000users respectively. Vishing attacks can be launched against722,696 users. We also found 91,487 highly influential vic-tims who can be attacked by crafting whaling attacks. Toevaluate the effectiveness of one of our attacks, phishingattacks, we conducted an online roleplay study with 314participants on Amazon MTurk. Our results show that so-

cial phishing (69.2%) is the most successful phishing attackon OTT messaging applications, followed by spear (54.3%),and non-targeted phishing (35.5%). To mitigate the attacksdemonstrated in this paper, we also suggest some recom-mendations for OTT messaging applications and caller IDapplications which can prevent their users falling prey totargeted attacks.

9. REFERENCES[1] 91 Leading Social Networks Worldwide.

http://www.practicalecommerce.com/articles/

86264-91-Leading-Social-Networks-Worldwide.

[2] BSNL auction for Vanity Numbers. http://eauction.bsnl.co.in/auction1/index.aspx?id=74.

[3] Facebook Graph API.https://developers.facebook.com/docs/graph-api.

[4] Facebook Hello. http://www.engadget.com/2015/04/22/facebook-hello/.

[5] Fetching friends from Graph API.http://stackoverflow.com/questions/11135053/

fetching-list-of-friends-in-graph-api-or-fql-

appears-to-be-missing-some-friend.

[6] Four Of The Top Six Social Networks Are ActuallyChat Apps. http://marketingland.com/four-top-six-social-networks-actually-chat-apps-115168.

[7] HeadsUp for WhatsApp.http://www.adaptivemobile.com/blog/headsup-

for-whatsapp.

[8] How to send 5 million spam emails without evennoticing. https://nakedsecurity.sophos.com/2014/08/05/how-to-send-5-million-spam-emails/.

[9] Price for Vanity Numbers.http://articles.economictimes.indiatimes.com/

2007-10-13/news/27675454_1_digit-numbers-

mukul-khanna-minimum-price.

[10] SIM card cost, India. http://prepaid-data-sim-card.wikia.com/wiki/India.

[11] SMS Phishing.https://en.wikipedia.org/wiki/SMS_phishing.

[12] Truecaller. http://truecaller.com/.

[13] Vanity Numbers.http://www.openthemagazine.com/article/real-

india/calling-9999999999.

[14] Voice Phishing.https://en.wikipedia.org/wiki/Voice_phishing.

[15] Whaling? These Scammers Target Big Phish.http://www.scambusters.org/whaling.html.

[16] WhatsApp Spam Block Feature.http://www.ibtimes.co.uk/whatsapp-rolls-out-

new-spam-blocker-feature-1497715.

[17] Whitepages Caller ID and Block.http://www.whitepages.com/caller-id.

[18] S. Antonatos, I. Polakis, T. Petsas, and E. P.Markatos. A systematic characterization of IM threatsusing honeypots. In Proceedings of the 17th AnnualNetwork and Distributed System Security Symposium(NDSS), San Diego, CA, 2010.

[19] V. Balasubramaniyan, M. Ahamad, and H. Park.CallRank: Combating SPIT Using Call Duration,Social Networks and Global Reputation. InConference on Email and Anti-Spam, CEAS, 2007.

[20] V. A. Balasubramaniyan, A. Poonawalla, M. Ahamad,M. T. Hunter, and P. Traynor. PinDr0p: usingsingle-ended audio features to determine callprovenance. In Proceedings of the 17th ACMconference on Computer and communications security,pages 109–120. ACM, 2010.

[21] M. Balduzzi, C. Platzer, T. Holz, E. Kirda,D. Balzarotti, and C. Kruegel. Abusing SocialNetworks for Automated User Profiling. In RecentAdvances in Intrusion Detection, pages 422–441.Springer, 2010.

[22] L. Bilge, T. Strufe, D. Balzarotti, and E. Kirda. AllYour Contacts Are Belong to Us: Automated IdentityTheft Attacks on Social Networks. In Proceedings ofthe 18th International Conference on World WideWeb, WWW ’09, pages 551–560, 2009.

[23] Y. Boshmaf, I. Muslukhov, K. Beznosov, andM. Ripeanu. Design and analysis of a social botnet.Computer Networks, 57(2):556–578, 2013.

[24] Y. Cai. Phonebook use to filter unwantedtelecommunications calls and messages, Dec. 21 2005.US Patent App. 11/314,108.

[25] Y. Cheng, L. Ying, S. Jiao, P. Su, and D. Feng. BindYour Phone Number with Caution: Automated UserProfiling Through Address Book Matching onSmartphone. In Proceedings of the 8th ACM SIGSACsymposium on Information, computer andcommunications security, pages 335–340. ACM, 2013.

[26] S. Chiappetta, C. Mazzariello, R. Presta, and S. P.Romano. An anomaly-based approach to the analysisof the social behavior of VoIP users. ComputerNetworks, pages 1545–1559, 2013.

[27] R. Dantu and P. Kolan. Detecting spam in voipnetworks. In Steps to Reducing Unwanted Traffic onthe Internet Workshop, SRUTI’05. USENIXAssociation, 2005.

[28] E. Department of Health et al. The belmont report.ethical principles and guidelines for the protection ofhuman subjects of research. The Journal of theAmerican College of Dentists, 81(3):4, 2014.

[29] J. S. Downs, M. B. Holbrook, and L. F. Cranor.Decision strategies and susceptibility to phishing. InProceedings of the second symposium on Usableprivacy and security, pages 79–90. ACM, 2006.

[30] S. E. Griffin and C. C. Rackley. Vishing. InProceedings of the 5th annual conference onInformation security curriculum development, pages33–35. ACM, 2008.

[31] P. Gupta, M. Ahamad, J. Curtis,V. Balasubramaniyan, and A. Bobotek. M3AAWGTelephony Honeypots: Benefits and DeploymentOptions. Technical report, 2014.

[32] P. Gupta, S. Gottipati, J. Jiang, and D. Gao. YourLove is Public Now: Questioning the Use of PersonalInformation in Authentication. In Proceedings of the8th ACM SIGSAC Symposium on Information,Computer and Communications Security, ASIA CCS’13, pages 49–60. ACM.

[33] P. Gupta, B. Srinivasan, V. Balasubramaniyan, andM. Ahamad. Phoneypot: Data-driven Understandingof Telephony Threats. In 22nd Annual Network andDistributed System Security Symposium, NDSS 2015,

2015.

[34] M. Huber, M. Mulazzani, E. Weippl, G. Kitzler, andS. Goluch. Friend-in-the-middle attacks: Exploitingsocial networking sites for spam. Internet Computing,IEEE, 15(3):28–34, 2011.

[35] T. N. Jagatic, N. A. Johnson, M. Jakobsson, andF. Menczer. Social Phishing. Communications of theACM, 50(10):94–100, 2007.

[36] M. Jakobsson and J. Ratkiewicz. Designing ethicalphishing experiments: a study of (rot13) ronl queryfeatures. In Proceedings of the 15th internationalconference on World Wide Web, pages 513–522. ACM,2006.

[37] M. Jakobsson and A. L. Young. Distributed PhishingAttacks. IACR Cryptology ePrint Archive, 2005:91,2005.

[38] E. Kim, K. Park, H. Kim, and J. Song. I’ve Got YourNumber: Harvesting users’ personal data via contactssync for the KakaoTalk messenger. In InformationSecurity Applications: 15th International Workshop,WISA 2014, Jeju Island, Korea, August 25-27, 2014.

[39] H.-J. Kim, M. J. Kim, Y. Kim, and H. C. Jeong.DEVS-based modeling of VoIP spam callers’ behaviorfor SPIT level calculation. Simulation ModellingPractice and Theory, 17(4):569–584, 2009.

[40] P. Kumaraguru, J. Cranshaw, A. Acquisti, L. Cranor,J. Hong, M. A. Blair, and T. Pham. School of phish:A Real-World Evaluation of Anti-Phishing Training.In Proceedings of the 5th Symposium on UsablePrivacy and Security, page 3. ACM, 2009.

[41] P. Kumaraguru, S. Sheng, A. Acquisti, L. F. Cranor,and J. Hong. Teaching Johnny Not to Fall for Phish.ACM Transactions on Internet Technology (TOIT),10(2):7, 2010.

[42] S. Kurowski. Using a whatsapp vulnerability forprofiling individuals. Open IdentitySummit,GI-Edition - Lecture Notes in Informatics(LNI) - Proceedings 237, pages 140–146, 2014.

[43] H. Mustafa, W. Xu, A. R. Sadeghi, and S. Schulz. YouCan Call but You Can’t Hide: Detecting Caller IDSpoofing Attacks. In Dependable Systems andNetworks (DSN), 2014 44th Annual IEEE/IFIPInternational Conference on, pages 168–179.

[44] J. Quittek, S. Niccolini, S. Tartarelli, M. Stiemerling,M. Brunner, and T. Ewald. Detecting SPIT calls bychecking human communication patterns. InCommunications, 2007. ICC’07. IEEE InternationalConference on, pages 1979–1984.

[45] M. Scat’a and A. La Corte. User-profile frameworkagainst a spit attack on VoIP systems. InternationalJournal for Information Security Research, pages121–131, 2011.

[46] S. Schrittwieser, P. Fruhwirt, P. Kieseberg,M. Leithner, M. Mulazzani, M. Huber, and E. R.Weippl. Guess Who’s Texting You? Evaluating theSecurity of Smartphone Messaging Applications. In19th Annual Network and Distributed System SecuritySymposium, NDSS 2012.

[47] S. Sheng, M. Holbrook, P. Kumaraguru, L. F. Cranor,and J. Downs. Who Falls for Phish? A DemographicAnalysis of Phishing Susceptibility and Effectivenessof Interventions. In Proceedings of the 28th

international conference on Human factors incomputing systems, CHI ’10.

[48] J. Song, H. Kim, and A. Gkelias. iVisher: Real-TimeDetection of Caller ID Spoofing. ETRI Journal,36(5):865–875, 2014.

[49] Y.-S. Wu, S. Bagchi, N. Singh, and R. Wita. Spamdetection in voice-over-ip calls through

semi-supervised clustering. In Dependable Systems &Networks, 2009. DSN’09., pages 307–316. IEEE, 2009.

[50] E. Zheleva and L. Getoor. Privacy in social networks:A survey. In Social network data analytics, pages277–306. Springer, 2011.

10. APPENDIX

(a) Truecaller (b) Whitepages CallerID & Block

(c) Facebook’s Hello (d) Whoscall-Caller IDand Block

(e) Contactive

Figure 7: Incoming call showing fake HDFC bank (example in our case) on various caller ID applications.