Post on 21-Mar-2018
Domain Generation Algorithm Malware
Enrico Hugo, CFP, CEH
ID-CERT Malware Summit II
13 April 2017 | Graha Merah Putih PT Telkom Indonesia | Bandung, Indonesia
About Me
Enrico Hugo, CFP, CEH
Bachelor of Science in Computer Science at Binus International
Ex-IT Security Intern at CBN
enrico.hugo [at] yahoo.co.id
http://www.linkedin.com/enricohugo
I have just finished my undergraduate study at Binus University International, Indonesia
Current Research Interests
• DNS Analysis
• Netflow Analysis
• Data Mining
• Machine Learning
Community
• Indonesia Honeynet Project - Member
Agenda
• Domain Name System and its threats
• Domain Generation Algorithm
• Environment Setup
• Detecting DGA
• DGA Case Study
• Possible Improvements
• Conclusion
Domain Name System and its threats
Domain Name System (DNS)
• Phonebook system that maps domain names to IP addresses
• Also supports reverse lookup to find the domain name that corresponds to an IP address
• Provides a caching system
• The core protocol is largely unchanged since its first release: it has had no wholesale security replacement, unlike telnet (replaced by ssh) or ftp (replaced by sftp)
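As a small illustration of the reverse lookup bullet: a reverse query asks for the PTR record of a name built from the reversed address under in-addr.arpa. The sketch below only builds that name with Python's standard ipaddress module and sends no query. The address 192.0.2.1 is a documentation address chosen for the example, not one from the talk.

```python
import ipaddress


def reverse_lookup_name(ip: str) -> str:
    """Return the DNS name whose PTR record holds the reverse mapping for ip."""
    return ipaddress.ip_address(ip).reverse_pointer


# 192.0.2.1 is a reserved documentation address (an assumption for this example).
print(reverse_lookup_name("192.0.2.1"))  # → 1.2.0.192.in-addr.arpa
```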
DNS Threats
• DNS cache poisoning
• DNS tunneling
• DNS amplification attack
• Domain Generation Algorithm
• DNS Fast Flux
• and many more ...
Domain Generation Algorithm
What is Domain Generation Algorithm?
Domain generation algorithms (DGA) are algorithms seen in various families of malware that are used to periodically generate a large number of domain names that can be used as rendezvous points with their command and control servers.
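To make the idea concrete, here is a toy sketch of a date-seeded generator. It is not the algorithm of any malware family named in this talk. The seed string, the SHA-256 construction, the label length and the reserved .example suffix are all assumptions chosen for illustration. The point it shows is that the malware and its operator can compute the same list independently, because both know the seed and the date.

```python
import hashlib
from datetime import date


def toy_dga(day: date, count: int = 5, seed: str = "example-seed",
            tld: str = ".example") -> list:
    """Derive `count` pseudo-random domain names from a shared seed and a date.

    Hypothetical construction for illustration only; real families differ.
    """
    domains = []
    for i in range(count):
        material = f"{seed}-{day.isoformat()}-{i}".encode()
        digest = hashlib.sha256(material).hexdigest()
        # Map the first 12 hex digits to letters a-p so the label is alphabetic.
        label = "".join(chr(ord("a") + int(c, 16)) for c in digest[:12])
        domains.append(label + tld)
    return domains


# The same date always yields the same list; a different date yields a new one.
for name in toy_dga(date(2016, 11, 15)):
    print(name)
```

A defender who recovers the seed and the construction can precompute the list for any day, which is the basis of the proof step in the case study.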
DGA Characteristics
• Many NXDOMAIN responses, since most generated domains are never registered
• Randomness usually appears in the second-level (2LD) or third-level (3LD) label
• A lot of requests from the same IP address
• Names range from completely unreadable strings (which do not follow Zipf's Law) to dictionary words (harder to detect).
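The first and third characteristics can be checked together by counting NXDOMAIN responses per client. The sketch below assumes the DNS log has already been parsed into (client IP, query name, response code) tuples. That record layout, the threshold and the sample records are assumptions for illustration, not output of the DGA Monitor described in this talk.

```python
from collections import Counter


def nxdomain_heavy_clients(records, min_nxdomain=3):
    """Return (client_ip, count) pairs for clients with many NXDOMAIN answers.

    records: iterable of (client_ip, query_name, rcode) tuples (assumed layout).
    """
    counts = Counter(ip for ip, _name, rcode in records if rcode == "NXDOMAIN")
    return [(ip, n) for ip, n in counts.most_common() if n >= min_nxdomain]


# Made-up sample records using documentation addresses and reserved names.
sample = [
    ("192.0.2.10", "qwrtplkz.example", "NXDOMAIN"),
    ("192.0.2.10", "bhdstmnv.example", "NXDOMAIN"),
    ("192.0.2.10", "zzkqpwxr.example", "NXDOMAIN"),
    ("192.0.2.20", "www.example.com", "NOERROR"),
    ("192.0.2.20", "typo.example", "NXDOMAIN"),
]
print(nxdomain_heavy_clients(sample))  # → [('192.0.2.10', 3)]
```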
Malware using DGA
• Kraken
• Conficker
• Gameover Zeus
• Pykspa
• Mad Max
• PandaBanker
• Pushdo
• Ramnit
• Cryptolocker
• Dyre
• Darkshell
• Locky
• Srizbi
• Torpig
• Virut
• etc.
Environment Setup
Detecting DGA
Detecting DGA - Zipf’s Law
Zipf's law states that given some corpus of natural language utterances, the frequency of any word is inversely proportional to its rank in the frequency table. Thus the most frequent word will occur approximately twice as often as the second most frequent word, three times as often as the third most frequent word, and so on.
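A minimal sketch of the rank-frequency comparison behind this statement: count the tokens, rank them by frequency, and set each observed count next to the ideal Zipf value (top count divided by rank). What counts as a token when the law is applied to domain names is not preserved in this transcript, so the function takes any token list and the sample input is made up.

```python
from collections import Counter


def zipf_table(tokens):
    """Return (rank, token, observed_count, zipf_expected) rows.

    zipf_expected is the top count divided by rank, i.e. the ideal Zipf curve.
    """
    ranked = Counter(tokens).most_common()
    if not ranked:
        return []
    top = ranked[0][1]
    return [(rank, tok, n, top / rank)
            for rank, (tok, n) in enumerate(ranked, start=1)]


# Made-up tokens that follow the ideal curve exactly: 6, 3, 2 occurrences.
for row in zipf_table(["a"] * 6 + ["b"] * 3 + ["c"] * 2):
    print(row)
```

A large gap between the observed and expected columns suggests the input does not behave like natural language.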
DGA Monitor
Detecting DGA - Hierarchical Clustering
Level 1
• Query Length
• Numeric Chars
Level 2
• Unreadable Bigram Ratio
• Consonant-Vowel Ratio
Level 3
• Squared Value of Numeric Chars
Level 4
• Maximum Consonant Sequence Length
• Maximum Label Length
Level 5
• 2LD Frequency Score
• 3LD Frequency Score
Cluster counts shown on the slide: 5, 2, 2, 2 and 3 clusters.
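The simpler features in the level list can be computed directly from the query name. The sketch below covers Query Length, Numeric Chars, Squared Value of Numeric Chars and Maximum Label Length. The exact definitions used in the talk are not given in this transcript, so details such as counting the dots in the query length are assumptions.

```python
def basic_features(domain: str) -> dict:
    """Compute simple lexical features of a query name (assumed definitions)."""
    labels = domain.lower().rstrip(".").split(".")
    name = ".".join(labels)
    numeric = sum(ch.isdigit() for ch in name)
    return {
        "query_length": len(name),            # dots included (assumption)
        "numeric_chars": numeric,
        "numeric_chars_squared": numeric ** 2,
        "max_label_length": max(len(label) for label in labels),
    }


print(basic_features("domobhdst.net"))
print(basic_features("a1b2.example.com"))
```

Each level of the clustering would then group domains using only the features assigned to that level.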
UBRatio and CVRatio
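The slide names the two ratios, but the transcript does not preserve their definitions. The sketch below therefore uses assumed definitions: CVRatio as consonants divided by vowels (with a vowel count of zero treated as one to avoid division by zero), and UBRatio as the share of adjacent letter pairs that do not occur in a reference set of readable bigrams. The reference set here is built from a caller-supplied word list, which is a hypothetical stand-in for whatever corpus the talk used.

```python
VOWELS = set("aeiou")


def cv_ratio(label: str) -> float:
    """Consonants divided by vowels (assumed definition of CVRatio)."""
    letters = [c for c in label.lower() if c.isalpha()]
    vowels = sum(c in VOWELS for c in letters)
    consonants = len(letters) - vowels
    return consonants / max(vowels, 1)


def bigrams(text: str) -> list:
    """Adjacent two-letter pairs; pairs containing non-letters are skipped."""
    text = text.lower()
    return [text[i:i + 2] for i in range(len(text) - 1) if text[i:i + 2].isalpha()]


def readable_bigrams(words) -> set:
    """Build the reference set from a word list (hypothetical corpus)."""
    return {pair for word in words for pair in bigrams(word)}


def ub_ratio(label: str, readable: set) -> float:
    """Share of bigrams not found in the reference set (assumed UBRatio)."""
    pairs = bigrams(label)
    if not pairs:
        return 0.0
    return sum(pair not in readable for pair in pairs) / len(pairs)


reference = readable_bigrams(["google", "domain", "network"])
print(cv_ratio("google"), cv_ratio("domobhdst"))
print(ub_ratio("google", reference), ub_ratio("domobhdst", reference))
```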
Maximum Consonant Sequence Length (MCSLen)
• google.com -> 2 characters ("gl")
• domobhdst.net -> 5 characters ("bhdst")
Algorithmically-generated domains tend to have a longer Maximum Consonant Sequence Length (MCSLen).
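The two examples on this slide can be reproduced with a single pass over the name. In this sketch the vowels are taken to be a, e, i, o and u, and any non-letter (dot, digit, hyphen) ends the current run. Both choices are assumptions, since the transcript does not state how the talk treats them.

```python
def mcs_len(domain: str) -> int:
    """Length of the longest run of consecutive consonants in the name."""
    longest = current = 0
    for ch in domain.lower():
        if ch.isalpha() and ch not in "aeiou":
            current += 1
            longest = max(longest, current)
        else:
            # A vowel, digit, dot or hyphen breaks the consonant run.
            current = 0
    return longest


print(mcs_len("google.com"))     # → 2
print(mcs_len("domobhdst.net"))  # → 5
```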
Cluster Descriptions
Clustering Results
Case Study
Case Study – The Discovery of Pykspa Malware
[Tables: Top 20 lists for 2nd of November 2016, 8th of November 2016 and 14th of November 2016]
"N times" is the number of DNS requests from an IP address that were blocked (by Palo Alto).
210.210.150.30 appears on all of the lists shown. Only three days of samples are shown in this slide, but in fact the IP is on the Top 20 list every day, which is suspicious.
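The observation that one IP appears on every daily list can be automated as a set intersection. In the sketch below only 210.210.150.30 comes from the slide. The other addresses are documentation addresses invented for the example, and the lists are far shorter than a real Top 20.

```python
def persistent_ips(daily_top_lists):
    """Return IPs present in every daily list, sorted for stable output."""
    days = [set(ips) for ips in daily_top_lists]
    if not days:
        return []
    return sorted(set.intersection(*days))


# Illustrative lists only; not the real data behind the slide.
top_lists = [
    ["210.210.150.30", "192.0.2.1", "192.0.2.2"],     # 2nd of November 2016
    ["198.51.100.7", "210.210.150.30", "192.0.2.2"],  # 8th of November 2016
    ["210.210.150.30", "203.0.113.9", "192.0.2.1"],   # 14th of November 2016
]
print(persistent_ips(top_lists))  # → ['210.210.150.30']
```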
Case Study - Steps of Detection
• Deploy a Dionaea honeypot on the same subnet
• Direct SSH access
• List running processes using ps aux
• See resource consumption using top
• Find the suspected file location using find
• Upload the files to VirusTotal
– sujeljlanddrcsuj.exe => KillAV Trojan
– vmqaw.exe => Pykspa Worm
• Pykspa is said to be spread through Skype, so I searched for Skype and found no running Skype instance, but found a Skype installer file.
• Or ...
Case Study – Proof of Detection
• Johannes Bader (https://johannesbader.ch) reverse engineered the Pykspa worm and worked out its DGA, which combines many noisy (camouflage) domains with some useful (intended) ones.
• Using his Python script, we obtain domain names that Pykspa will use on the day the script is run, as seen in the next slide.
• The script: https://johannesbader.ch/2015/03/the-dga-of-pykspa/dga.zip
• 10 sample DGA domains of Pykspa for 15th of November 2016
Possible Improvements
• Improve DGA Monitor by creating a blacklist and a whitelist
• Find a method to confirm whether a given domain name is a DGA domain
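One possible shape for the blacklist and whitelist improvement is a triage step that runs before any scoring. This is a sketch under assumptions, not the DGA Monitor's design: the function name, the three outcomes and the rule that an entry also covers its subdomains are all hypothetical.

```python
def _matches(name: str, entries) -> bool:
    """True if name equals an entry or is a subdomain of one."""
    return any(name == e or name.endswith("." + e) for e in entries)


def triage(domain: str, whitelist, blacklist) -> str:
    """Return 'block', 'allow' or 'inspect' for a query name (hypothetical)."""
    name = domain.lower().rstrip(".")
    if _matches(name, blacklist):
        return "block"      # already confirmed as a DGA domain
    if _matches(name, whitelist):
        return "allow"      # known benign, skip further scoring
    return "inspect"        # unknown: pass to the detection features


whitelist = {"google.com"}
blacklist = {"domobhdst.net"}
print(triage("www.google.com", whitelist, blacklist))
print(triage("domobhdst.net", whitelist, blacklist))
print(triage("notgoogle.com", whitelist, blacklist))
```

Checking the blacklist first means a domain that ends up on both lists is still blocked.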
Conclusion
• Blocked does not mean solved.
• Look for NXDOMAIN and SERVFAIL responses when detecting DGA.
• It is necessary to be proactive, not reactive, by consistently performing Threat Hunting.
Join Us
• http://www.ihpcon.id
• Indonesia Honeynet Project
• idhoneynet
• http://www.honeynet.or.id
• http://groups.google.com/group/id-honeynet
enrico.hugo [at] yahoo.co.id and +62 857 1631 5877