‘Supervised Automation’ for Malware Variant Generation: Theoretical and Practical Implications
Rachit Mathur, Research Scientist, McAfee
18th EICAR Annual Conference, 9th – 12th May, 2009, Berlin, Germany
Agenda
Introduction & Malware Growth
Supervised-Automation
Compare With Metamorphism
Real-World Examples
Detection Challenges
Conclusions & Future work
Questions
Malware Growth – All known samples
+180%
Malware Growth – Families vs Variants
Rogue AV Unique Binaries Discovered
Sample Count Explosion
• Lots of variants per family
• New variants released even before a signature for previous ones gets released
• Money-motivated organized malware gangs
  – ‘Professional products’
  – Pose serious detection challenges
• Difficult to anticipate changes
• Short-term per family proactive detection is minimum requirement
  – Use bleeding-edge technology
• Conficker – crypto algorithms
• MBR rootkit – stealth techniques
• To evade detection is the primary motive
Morphing Malware
• Not the traditional poly or metamorphics
• Do not carry the mutator
• Delivered through the cloud (server-side)
  – Drive-by downloads, social engineering, self-updating malware
  – Binaries change often
• Now adopted by all
  – Backdoor, PWS, AdClicker, Proxy, Worms etc.
• Morphing services
  – Tibs-Packed: Storm worm, downloader, uploader, spam-bot, backdoors etc.
  – FakeAV-looking downloaders, backdoors, worms
• Human supervised automated variant generation system
Supervised-Automation
• Supervised Automation (SA) is a semi-automated method of generating malware variants with sporadic human intervention
• Loosely related to the concept of metamorphism
• Not based off of any particular malware family
Supervised Automation
[Figure: SA pipeline. Stages: Malicious binary & info (B, Info); Select and apply encryption, giving E(B); Select and apply morphing, giving M(E(B)); Black-Box signature extraction; Release-to-world. A Human supplies Info to the stages. A path labelled "Loop-back to re-encrypt" feeds back into the pipeline.]
• Encryption options shown: ADD, SUB, XOR, ROT, RC4
• Morphing options shown: Dead Code Insertion, Junk Code Insertion, CFG Obfuscation, Instruction Substitution, Decryption Key Obfuscation, Geometric Fuzzyfication
Supervised-Automation
• Generate any number of new variants at the desired frequency
• Motive is to evade detection and not ‘blindly’ generate variants
• Different pattern of operation observed in Tibs-Packed, FakeAV, GamePWS trojans
SA vs. Metamorphism
• Generally speaking, virus detection is undecidable
• Solutions for specific sub cases have been proposed
• Let us see what existing results from comparable technology apply to SA
• Purely automatic variant generation, i.e. metamorphism, has already been studied
SA vs. Metamorphism
• Do not carry the engine
• Transformation logic is not self-contained
• Transformation rules not constant
• No feed-back loop
• Transformations not limited
• Anti debugging, anti disassembly, anti emulation: anti analyses
[Figure: Metamorphic engine. Stages: Locate own code, Decode, Analyze, Transform.]
Normalization based approach
• Transformation rules modelled as Term Rewriting Systems (TRS) and related to formal grammars
• Proving equivalence between two programs w.r.t. a rewriting system reduces to the famous word problem
  – Undecidable in general
  – Unless TRS is confluent and terminating
  – Some approximation based approaches
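[Figure: example rewrite rules over x86 instruction sequences, each with an applicability condition. Fragments shown:
  mov edi, 0x04
  mov eax, 0x04 ; push eax
  push eax ; mov eax, 0x04
  push 0x04
  push ecx ; mov ecx, 0x04 ; mov edi, ecx ; pop ecx
  push eax
Conditions shown: "unconditional", "eax not live" (twice).]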
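The idea of normalizing variants with a terminating rewrite system can be sketched in a few lines of Python. This is a toy illustration, not the normalizer from the talk: the instruction encoding as tuples, the function names, and the pairing of the slide's fragments into two rules (a `push r / mov r, imm / mov d, r / pop r` sequence rewritten to `mov d, imm` unconditionally, and `mov r, imm / push r` rewritten to `push imm` only when `r` is not live afterwards) are assumptions made for the example. Liveness is approximated by a forward scan over straight-line code and treated conservatively at the end of the sequence.

```python
# Toy normalizer: instructions are tuples such as ("mov", "eax", "0x04").
# The rule set below is an assumed reading of the slide, for illustration only.
REGS = {"eax", "ecx", "edx", "ebx", "esi", "edi"}

def reads(ins):
    """Registers read by a mov/push instruction in this toy model."""
    if ins[0] == "mov":
        return {ins[2]} & REGS
    if ins[0] == "push":
        return {ins[1]} & REGS
    return set()

def writes(ins):
    """Registers written by a mov/pop instruction in this toy model."""
    return {ins[1]} if ins[0] in ("mov", "pop") else set()

def dead_after(prog, start, reg):
    """True if reg is overwritten before it is read, scanning from start.
    Reaching the end without an overwrite is conservatively treated as live."""
    for ins in prog[start:]:
        if reg in reads(ins):
            return False
        if reg in writes(ins):
            return True
    return False

def rewrite_once(prog):
    """Apply the first matching rule and return the new program, else None."""
    for i in range(len(prog)):
        w = prog[i:i + 4]
        # Rule 1 (unconditional): push r; mov r, imm; mov d, r; pop r  ->  mov d, imm
        if len(w) == 4 and w[0][0] == "push" and w[0][1] in REGS:
            r = w[0][1]
            if (w[1][0] == "mov" and w[1][1] == r and w[1][2] not in REGS
                    and w[2][0] == "mov" and w[2][2] == r
                    and w[2][1] in REGS and w[2][1] != r
                    and w[3] == ("pop", r)):
                return prog[:i] + [("mov", w[2][1], w[1][2])] + prog[i + 4:]
        w = prog[i:i + 2]
        # Rule 2 (only if r is not live afterwards): mov r, imm; push r  ->  push imm
        if (len(w) == 2 and w[0][0] == "mov" and w[0][1] in REGS
                and w[0][2] not in REGS and w[1] == ("push", w[0][1])
                and dead_after(prog, i + 2, w[0][1])):
            return prog[:i] + [("push", w[0][2])] + prog[i + 2:]
    return None

def normalize(prog):
    """Rewrite until no rule applies. Every rule shortens the program,
    so the loop always terminates."""
    prog = list(prog)
    while True:
        nxt = rewrite_once(prog)
        if nxt is None:
            return prog
        prog = nxt

variant_1 = [("push", "ecx"), ("mov", "ecx", "0x04"), ("mov", "edi", "ecx"), ("pop", "ecx")]
variant_2 = [("mov", "edi", "0x04")]
same_normal_form = normalize(variant_1) == normalize(variant_2)
```

Because both rules strictly shrink the sequence, this small system terminates; comparing normal forms is then a cheap equivalence check for variants produced by these rules. When the rule set changes between releases, as it does under SA, a normalizer built for one rule set no longer gives that guarantee.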
Normalization based approach
[Figure: rule systems RS1, RS2, RS3 in use over Time]
• Multiple TRS bad news for some solutions
• Q: Do multiple TRS really make a difference?
• Same worst case for a ‘well-designed’ system
• But multiple TRS does make things worse
Approaches
• Approaches that are agnostic of rule systems can be useful against such systems
• Smart byte-based detection schemes
• Normalization based on general optimization techniques and program semantics based detection methods
• Behaviour based detection may be useful today
• Emulation based techniques have been proposed earlier to identify detectable behaviours but emulation has a host of well known problems
Example – Storm worm
[Figure: annotated decryptor listing. Labels: start of encrypted code; end of encrypted code; "Fake call returns -1."; "Add, rotate".]
• Locate the start address of encrypted data and size/end of the data
• Calculate key(s): key[i]
• Apply key(s)
• Transfer control to decrypted code
Example – Storm worm
[Figure: variant generation tree over time. A Base Variant (BV) is processed by Algorithm A, Algorithm B, Algorithm C … Algorithm N to give EBV1, EBV2, EBV3 …; each of these yields morphed variants M1, M2 … Mn (shown as M11, M12 … M1n, M21, M22 … M2n), with nodes labelled K. Timeline labels: Day 1, Day 2 … Day m; Day m+1, Day m+2 … Day n; Day n+1, Day n+2 … Day o.]
• High, medium and low frequency changes
Example – DNSChanger
• Uses obfuscated calls
Rules can be conditional
Possible call targets
Example – PWS dll
• Rules change often
• Constructs strings
  – HBXYXND-0109-NEW
  – WM_HOOKEX_RK
  – Explorer.exe
  – act=getpos&account=%s
• junk code
• variable renaming
• register liveness
• second one is reversed
Example – FakeAV
Detection Challenges
• Virus authors want to evade detection, and stay undetected once a machine is compromised
• AV update should detect the ‘current’ variant – somewhat ‘proactive’
• Able to detect all automatically generated variants until the next human-driven update
• Resistant to non-functional changes
Signatures
• Goal is to find ‘enough’ evidence to detect and classify a file for practical purposes such that it will not generate any false positives
  – Generic
  – Reliable: no false positives
  – “my virus botnet, attack ms08-067 ping”
Signatures
• Simple byte sequence based not useful
  – Hash based
  – Detection worthy strings
  – Detection worthy code sequence
• Multiple sets of wildcard based byte sequences at various locations that remain constant
• Emulation
• Decryption or cryptanalysis based
  – Presence of a technique can lend itself to detection
• Geometry based
• Combination provides the right balance
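The contrast between hash based detection and sets of wildcard based byte sequences can be sketched in Python. Everything here is illustrative: the `??` pattern syntax, the function names, the byte strings standing in for two variants, and the signature set are made up for the example and are not taken from any real product or sample.

```python
import hashlib

def parse_sig(pattern):
    """Turn '6a 04 e8 ?? ??' into a list of ints, with None for wildcards."""
    return [None if tok == "??" else int(tok, 16) for tok in pattern.split()]

def find_sig(data, sig):
    """Return the first offset at which sig matches data, or -1 if none."""
    for off in range(len(data) - len(sig) + 1):
        if all(b is None or data[off + k] == b for k, b in enumerate(sig)):
            return off
    return -1

def detect(data, patterns):
    """Require every pattern in the set to match somewhere in the data."""
    return all(find_sig(data, parse_sig(p)) >= 0 for p in patterns)

# Hypothetical byte strings standing in for two variants of one family:
# the constant parts survive, the padding and a 4-byte operand differ.
variant_a = bytes.fromhex("90906a04e8112233445dc3")
variant_b = bytes.fromhex("6a04e8aabbccdd90905dc3")

# Hypothetical signature set: two byte sequences that remain constant.
SIGNATURE_SET = ["6a 04 e8 ?? ?? ?? ??", "5d c3"]

hashes_differ = (hashlib.sha256(variant_a).hexdigest()
                 != hashlib.sha256(variant_b).hexdigest())
both_detected = detect(variant_a, SIGNATURE_SET) and detect(variant_b, SIGNATURE_SET)
```

A whole-file hash changes with any byte, so it identifies one binary only, while the wildcard set keeps matching as long as the chosen constant regions survive. Patterns this short would be far too prone to false positives in practice; they are kept short only to make the example readable.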
Conclusions & Future Work
• Stakes are getting bigger with increasingly critical, sensitive, high-value information at risk
• Adoption of cutting-edge research concepts and innovation skills by virus authors
• More automation and more understanding of ‘correct’ transformation techniques is expected
• Interesting to formalize some results in the realm of SA based malware
• Detection solutions that are agnostic of rewrite systems need to be investigated
• It will also be interesting to see how behaviour evolution materializes in practice; forward-looking research in that area is very relevant
Thank You! (Danke!)
Suggestions & Questions: Email: [email protected]