![Page 1: Ariu - Workshop on Artificial Intelligence and Security - 2011](https://reader035.vdocuments.us/reader035/viewer/2022081907/5488747cb479590a0d8b56c5/html5/thumbnails/1.jpg)
Pattern Recognition and Applications Group Department of Electrical and Electronic Engineering University of Cagliari, Italy
Machine Learning in Computer Forensics (and the Lessons Learned from Machine Learning in Computer Security)
D. Ariu G. Giacinto F. Roli
AISEC
4th Workshop on Artificial Intelligence and Security
Chicago – October 21, 2011
![Page 2: Ariu - Workshop on Artificial Intelligence and Security - 2011](https://reader035.vdocuments.us/reader035/viewer/2022081907/5488747cb479590a0d8b56c5/html5/thumbnails/2.jpg)
What can be analyzed… (during an investigation)
October 21 - 2011 Davide Ariu - AISEC 2011 2
![Page 3: Ariu - Workshop on Artificial Intelligence and Security - 2011](https://reader035.vdocuments.us/reader035/viewer/2022081907/5488747cb479590a0d8b56c5/html5/thumbnails/3.jpg)
Role of Computer Forensics (with respect to Computer Security)
[Diagram, along the axis “Cyber Attack (or Crime) Progress”: Prevention (Security); Detection (Security); (live) Forensics; Truth Assessment (Forensics)]
![Page 4: Ariu - Workshop on Artificial Intelligence and Security - 2011](https://reader035.vdocuments.us/reader035/viewer/2022081907/5488747cb479590a0d8b56c5/html5/thumbnails/4.jpg)
Goals
• To provide a small snapshot of ML research
applied to Computer Forensics
• To clarify the ML approach to Computer Forensics
![Page 5: Ariu - Workshop on Artificial Intelligence and Security - 2011](https://reader035.vdocuments.us/reader035/viewer/2022081907/5488747cb479590a0d8b56c5/html5/thumbnails/5.jpg)
Historical Perspective
Computer Security
• Early ’70s – The first Computer Security research papers appear
• 1988 – The first known internet-wide attack occurs (the “Morris Worm”)
• Early 2000s – Slammer and its friends in the wild: the consequent security issues are on TV channels and in newspapers
Computer Forensics
• 1984 – The FBI Laboratory began developing programs to examine computer evidence
• 1993 – International Law Enforcement Conference on Computer Evidence
• 1999-2007 – Computer Forensics “Golden Age” [Garfinkel, 2010]
![Page 6: Ariu - Workshop on Artificial Intelligence and Security - 2011](https://reader035.vdocuments.us/reader035/viewer/2022081907/5488747cb479590a0d8b56c5/html5/thumbnails/6.jpg)
Computer Security Research
• Strong Research Community
– Research groups and centers exist (almost) worldwide
• Well defined main research directions
– Malware and Botnet analysis and detection
– Web Applications Security
– Intrusion Detection
– Cloud Computing
• Well defined methodologies
– Research results can have an immediate practical impact
![Page 7: Ariu - Workshop on Artificial Intelligence and Security - 2011](https://reader035.vdocuments.us/reader035/viewer/2022081907/5488747cb479590a0d8b56c5/html5/thumbnails/7.jpg)
Computer Forensics Research
• Not a particularly strong research community (at least in terms of results achieved)
– Mostly people with a computer security background (like me..)
• Research directions are not well defined
• Approaches and methods are not well defined
– Digital forensics research results are difficult to reproduce [Garfinkel, 2009]
![Page 8: Ariu - Workshop on Artificial Intelligence and Security - 2011](https://reader035.vdocuments.us/reader035/viewer/2022081907/5488747cb479590a0d8b56c5/html5/thumbnails/8.jpg)
How can machine learning be useful in Computer Forensics?
• “Machine Learning methods are the best
methods in applications that are too complex for
people to manually design the algorithm” [Mitchell,2006]
• Reasoning is a fundamental step of the investigation
– Computer forensics is conceptually different from Intrusion Detection
• The huge mass of data to be analyzed (TB scale) makes intelligent analysis methods necessary
– Situations also exist where there is no time for an in-depth analysis (e.g. Battlefield Forensics)
![Page 9: Ariu - Workshop on Artificial Intelligence and Security - 2011](https://reader035.vdocuments.us/reader035/viewer/2022081907/5488747cb479590a0d8b56c5/html5/thumbnails/9.jpg)
ML applications to CF
• Machine Learning techniques have been proposed for several Computer Forensics applications
– Textual Documents and E-mail forensics
– Network Forensics
– Events and System Data Analysis
– Automatic file (fragment) classification
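To make the last item concrete, here is a minimal sketch of file fragment classification. It assumes byte-frequency histograms as features and a nearest-centroid rule, which is one simple choice and not necessarily what the surveyed works use. The function names, the two labels and the toy fragments are all made up for illustration.

```python
import math
from collections import Counter
from typing import Dict, List


def byte_histogram(fragment: bytes) -> List[float]:
    """Normalized frequency of each of the 256 possible byte values."""
    counts = Counter(fragment)
    total = len(fragment) or 1  # avoid division by zero on empty input
    return [counts.get(value, 0) / total for value in range(256)]


def shannon_entropy(fragment: bytes) -> float:
    """Entropy in bits per byte: 0 for constant data, 8 for uniform data."""
    return -sum(p * math.log2(p) for p in byte_histogram(fragment) if p > 0)


def centroid(fragments: List[bytes]) -> List[float]:
    """Average histogram of the training fragments of one file type."""
    histograms = [byte_histogram(f) for f in fragments]
    return [sum(column) / len(histograms) for column in zip(*histograms)]


def classify(fragment: bytes, centroids: Dict[str, List[float]]) -> str:
    """Return the label whose centroid is closest (squared Euclidean)."""
    hist = byte_histogram(fragment)

    def distance(label: str) -> float:
        return sum((a - b) ** 2 for a, b in zip(hist, centroids[label]))

    return min(centroids, key=distance)


# Toy training data (hypothetical): two text fragments, two "binary-like" ones.
centroids = {
    "text": centroid([
        b"the quick brown fox jumps over the lazy dog",
        b"machine learning in computer forensics",
    ]),
    "binary": centroid([bytes(range(256)), bytes(reversed(range(256)))]),
}

text_label = classify(b"evidence found in the log file", centroids)
binary_label = classify(bytes(range(0, 256, 2)), centroids)
uniform_entropy = shannon_entropy(bytes(range(256)))
```

A real evaluation would need far more data than this toy set, which is exactly the concern the deck raises about small test corpora.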
![Page 10: Ariu - Workshop on Artificial Intelligence and Security - 2011](https://reader035.vdocuments.us/reader035/viewer/2022081907/5488747cb479590a0d8b56c5/html5/thumbnails/10.jpg)
Computer Forensics Research Drawbacks
• The experimental results reported are not completely convincing…
– Network forensics solutions evaluated on the DARPA dataset only
– Email forensics algorithms evaluated on a corpus of 156 emails (and 3 different authors)
– Automatic file classification algorithms evaluated on a 500MB dataset (best case…)
• In addition, the approach adopted was the same as the one adopted in Computer Security…
![Page 11: Ariu - Workshop on Artificial Intelligence and Security - 2011](https://reader035.vdocuments.us/reader035/viewer/2022081907/5488747cb479590a0d8b56c5/html5/thumbnails/11.jpg)
How to improve existing tools?
• Useful solutions can be developed only if the focus is:
– On the investigator and on the knowledge of the case that he has
– On the organization and categorization of the information provided to the investigator
• Data sorting and categorization
• Prioritisation of results [Garfinkel, 2010; Beebe, 2009]
![Page 12: Ariu - Workshop on Artificial Intelligence and Security - 2011](https://reader035.vdocuments.us/reader035/viewer/2022081907/5488747cb479590a0d8b56c5/html5/thumbnails/12.jpg)
Putting knowledge into the tool…
• Computer Security tools (e.g. IDS) are based on well-defined criteria that are used to detect attacks
• In other contexts, where it is difficult to explicitly define a search criterion, the feedback provided by the user is exploited to achieve more accurate results
– E.g. Content-based Image Retrieval with relevance feedback [Zhouand,2003]
• This can definitely be the case for Computer Forensics applications..
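As an illustration of how user feedback can reshape a search criterion, the sketch below uses a Rocchio-style update. Rocchio is a classic relevance-feedback technique from information retrieval, picked here for brevity; it is not necessarily the method of the cited work. The feature vectors, item names and weights (`alpha`, `beta`, `gamma`) are assumptions chosen for the example.

```python
from typing import Dict, List


def rocchio_update(query: List[float],
                   relevant: List[List[float]],
                   non_relevant: List[List[float]],
                   alpha: float = 1.0,
                   beta: float = 0.75,
                   gamma: float = 0.15) -> List[float]:
    """Move the query towards items marked relevant and away from the others."""
    dims = len(query)

    def mean(vectors: List[List[float]]) -> List[float]:
        if not vectors:
            return [0.0] * dims
        return [sum(v[i] for v in vectors) / len(vectors) for i in range(dims)]

    pos, neg = mean(relevant), mean(non_relevant)
    return [alpha * q + beta * p - gamma * n for q, p, n in zip(query, pos, neg)]


def rank(items: Dict[str, List[float]], query: List[float]) -> List[str]:
    """Item names sorted by dot-product score against the query, best first."""
    def score(name: str) -> float:
        return sum(a * b for a, b in zip(items[name], query))

    return sorted(items, key=score, reverse=True)


# Hypothetical two-dimensional features for three pieces of evidence.
items = {"a": [1.0, 0.0], "b": [0.5, 0.5], "c": [0.1, 1.0]}
initial_query = [1.0, 0.2]

before = rank(items, initial_query)

# The investigator marks "c" as relevant and "a" as not relevant.
updated_query = rocchio_update(initial_query,
                               relevant=[items["c"]],
                               non_relevant=[items["a"]])
after = rank(items, updated_query)
```

In this toy run the ordering flips from a, b, c to c, b, a after a single round of feedback: the investigator's knowledge of the case, rather than a fixed detection rule, drives the result.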
![Page 13: Ariu - Workshop on Artificial Intelligence and Security - 2011](https://reader035.vdocuments.us/reader035/viewer/2022081907/5488747cb479590a0d8b56c5/html5/thumbnails/13.jpg)
Organizing data and results
• Discerning among the huge mass of data is an extremely time-consuming task for investigators
– E.g. Filtering the results obtained after file carving
– E.g. Inspecting all the pictures found in a laptop
• A tool can definitely be useful even if it is only able to sort results and contents according to a relevance criterion (most relevant first)
– The tool only assigns “scores”; the analyst will inspect the results..
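The “score and sort” idea can be sketched in a few lines. This assumes a deliberately naive relevance score (weighted keyword counts supplied by the investigator). The `CarvedItem` type, the file names and the keyword weights are hypothetical, and any real tool would use a richer scoring model.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class CarvedItem:
    name: str
    text: str


def relevance_score(item: CarvedItem, case_keywords: Dict[str, float]) -> float:
    """Weighted count of case-specific keywords found in the item's text."""
    words = item.text.lower().split()
    return sum(weight * words.count(keyword)
               for keyword, weight in case_keywords.items())


def prioritise(items: List[CarvedItem],
               case_keywords: Dict[str, float]) -> List[Tuple[float, str]]:
    """Return (score, name) pairs, most relevant first; nothing is discarded."""
    scored = [(relevance_score(item, case_keywords), item.name) for item in items]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Hypothetical carving output and keywords chosen by the investigator.
carved = [
    CarvedItem("f002.txt", "holiday photos from the beach"),
    CarvedItem("f003.txt", "account statement"),
    CarvedItem("f001.txt", "invoice transfer offshore account transfer"),
]
keywords = {"transfer": 2.0, "account": 1.0}

ranking = prioritise(carved, keywords)
ranked_names = [name for _, name in ranking]
```

The tool makes no decision here: every item stays in the list, and the analyst simply starts from the top.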
![Page 14: Ariu - Workshop on Artificial Intelligence and Security - 2011](https://reader035.vdocuments.us/reader035/viewer/2022081907/5488747cb479590a0d8b56c5/html5/thumbnails/14.jpg)
To summarize..
• We investigated the problem of applying ML to
Computer Forensics
• We provided a short overview of the literature
related to ML applications in Computer Forensics
• We proposed several guidelines to profitably
apply machine learning to Computer Forensics
![Page 15: Ariu - Workshop on Artificial Intelligence and Security - 2011](https://reader035.vdocuments.us/reader035/viewer/2022081907/5488747cb479590a0d8b56c5/html5/thumbnails/15.jpg)
Questions or Comments?
Thank you for your attention!