current challenges and future research areas for digital forensic investigation · 2016-05-25 ·...




Current Challenges and Future Research … CDFSL Proceedings 2016

© 2016 ADFSL Page 9

CURRENT CHALLENGES AND FUTURE RESEARCH AREAS FOR DIGITAL FORENSIC INVESTIGATION

David Lillis, Brett A. Becker, Tadhg O’Sullivan and Mark Scanlon

School of Computer Science, University College Dublin, Ireland

{david.lillis, brett.becker, t.osullivan, mark.scanlon}@ucd.ie

ABSTRACT

Given the ever-increasing prevalence of technology in modern life, there is a corresponding increase in the likelihood of digital devices being pertinent to a criminal investigation or civil litigation. As a direct consequence, the number of investigations requiring digital forensic expertise is resulting in huge digital evidence backlogs being encountered by law enforcement agencies throughout the world. It can be anticipated that the number of cases requiring digital forensic analysis will greatly increase in the future. It is also likely that each case will require the analysis of an increasing number of devices including computers, smartphones, tablets, cloud-based services, Internet of Things devices, wearables, etc. The variety of new digital evidence sources poses new and challenging problems for the digital investigator from an identification, acquisition, storage, and analysis perspective. This paper explores the current challenges contributing to the backlog in digital forensics from a technical standpoint and outlines a number of future research topics that could greatly contribute to a more efficient digital forensic process.

Keywords: Digital Evidence Backlog, Digital Forensic Challenges, Future Research Topics

INTRODUCTION

The early 21st century has seen a dramatic increase in new and ever-evolving technologies available to consumers and industry alike. Generally, the consumer-level user base is now more adept and knowledgeable about what technologies they employ in their day-to-day lives. The number of cases where digital evidence is relevant to an investigation is ever-increasing and it is envisioned that the existing backlog for law enforcement will balloon in the coming years as the prevalence of digital devices increases. It is for these reasons that it is important to take stock of the current state of affairs in the field of digital forensics. Cloud-based services, Internet of Things devices, anti-forensic techniques, distributed and high capacity storage, and the sheer volume and heterogeneity of pertinent devices pose new and challenging problems for the acquisition, storage and analysis of this digital evidence.

Due to the sheer volume of data to be acquired, stored, analysed, and reported, combined with the level of expertise necessary to ensure the court admissibility of the resultant evidence, it was inevitable that a significant backlog in cases awaiting analysis would occur (Hitchcock et al., 2016). Three particular aspects have contributed to this backlog (Quick and Choo, 2014):

1. An increase in the number of devices that are seized for analysis per case.

2. The number of cases whereby digital evidence is deemed pertinent is ever-increasing.

3. The volume of potentially evidence-rich data stored on each item seized is also increasing.

This backlog is having a significant impact on the ideal legal process. According to a report by the Garda Síochána Inspectorate [2015] (Irish National Police), delays of up to four years in conducting digital forensic investigations on seized devices have “seriously impacted on the timeliness of criminal investigations” in recent years. In some cases, these delays have resulted in prosecutions being dismissed in courts. This issue regarding the digital evidence backlog is further compounded due to the cross-border, intra-agency cooperation required by many forensic investigations. If a given country has an especially low digital investigative capacity, it can have a significant knock-on effect in an international context (James and Jang, 2014).

In this paper, we review relevant recent research literature to elucidate the developments and current challenges in the field. While much progress has been made in the digital forensic process in recent years, little work has made appreciable progress in tackling the evidence backlog in practice. While evidence is lying unanalysed in an evidence store, investigations are often left waiting for new leads to be discovered, which has serious consequences for following these new threads of investigation at a later date. A number of practical infrastructural improvements to the current forensic process are discussed including automation of device acquisition and analysis, Forensics-as-a-Service (FaaS), hardware-facilitated heterogeneous evidence processing, remote evidence acquisition, and cross-jurisdictional evidence sharing over the Internet. These infrastructural improvements will enable a number of both new and improved forensic processes. These may include data visualisation, multi-device evidence and timeline resolution, data deduplication for storage and acquisition purposes, parallel or distributed investigations and process optimisation of existing techniques. The aforementioned improvements should combine to aid law enforcement and private digital investigators to greatly expedite the current forensic process. It is envisioned that the future research areas presented as part of this paper will influence further research in the field.

CURRENT CHALLENGES

Raghavan (2013) outlined five major challenge areas for digital forensics, gathered from a survey of research in the area:

1. The complexity problem, arising from data being acquired at the lowest (i.e. binary) format with increasing volume and heterogeneity, which calls for sophisticated data reduction techniques prior to analysis.

2. The diversity problem, resulting naturally from ever-increasing volumes of data, but also from a lack of standard techniques to examine and analyse the increasing numbers and types of sources, which bring a plurality of operating systems, file formats, etc. The lack of standardisation of digital evidence storage and the formatting of associated metadata also unnecessarily adds to the complexity of sharing digital evidence between national and international law enforcement agencies (Scanlon and Kechadi, 2014).


3. The consistency and correlation problem resulting from the fact that existing tools are designed to find fragments of evidence, but not to otherwise assist in investigations.

4. The volume problem, resulting from increased storage capacities and the number of devices that store information, and a lack of sufficient automation for analysis.

5. The unified time-lining problem, where multiple sources present different time zone references, timestamp interpretations, clock skew/drift issues, and the syntax aspects involved in generating a unified timeline.

Numerous other researchers have identified more specific challenges, which can generally be categorised according to Raghavan’s above classification. Examples include Garfinkel (2010), Wazid et al. (2013), and Karie and Venter (2015).
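To make the unified time-lining problem (item 5 above) concrete, the following is a minimal Python sketch of normalising timestamps from two sources to UTC before merging them. The `Event` type, the device labels, and the assumption that each device's time zone offset and clock skew are already known are all illustrative; in practice, establishing those values is itself part of the problem.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Event:
    source: str          # hypothetical device label, e.g. "phone"
    when_utc: datetime   # timestamp normalised to UTC
    description: str


def normalise(source, local_time, utc_offset_hours, clock_skew_seconds, description):
    """Convert a naive device-local timestamp to UTC and correct a known clock skew.

    clock_skew_seconds is how far the device clock ran ahead of true time
    (negative if it ran behind). It is assumed to have been measured separately.
    """
    device_zone = timezone(timedelta(hours=utc_offset_hours))
    aware = local_time.replace(tzinfo=device_zone)
    corrected = aware - timedelta(seconds=clock_skew_seconds)
    return Event(source, corrected.astimezone(timezone.utc), description)


def unified_timeline(events):
    """Merge events from all sources into one chronologically ordered list."""
    return sorted(events, key=lambda event: event.when_utc)


events = [
    # The phone's local clock reads earlier than the laptop's, but once the
    # time zone and skew are accounted for, the phone event happened later.
    normalise("phone", datetime(2016, 5, 25, 8, 58, 30), -5, 90, "message sent"),
    normalise("laptop", datetime(2016, 5, 25, 14, 0, 0), 1, 0, "file modified"),
]
timeline = unified_timeline(events)
```

Sorting on the raw local timestamps would place the phone event first; after normalisation the laptop event correctly precedes it.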

It is widely agreed that the volume of data that is potentially relevant to investigations is growing rapidly. The amount of data per case at the FBI’s 15 regional computer forensic laboratories grew 6.65-fold between 2003 and 2011, from 84 GB to 559 GB (Roussev et al., 2013). One cause of this is the growth in storage capacities that has occurred in recent years. Additionally, the increasing proliferation of mobile and Internet of Things devices adds to the number of devices that require examination in a given investigation. Beyond the magnitude of the data, the use of cloud services means that it may not be clear what data exists and where it is actually located.

As advanced mobile and wearable technologies have continued to become more ubiquitous amongst the general population, they also now play a more prevalent role in digital forensic investigations. Over the past decade the capabilities of these smart devices have reached a point where they can function at a level near to that of the average household computer and are currently only limited by processing power and storage capacity. This contributes to the diversity problem, where a greater variety of devices become candidates for digital forensic investigation (e.g. Baggili et al. [2015] have reported on forensics on smart watches). Mobile and IoT devices make use of a variety of operating systems, file formats and communication standards, all of which add to the complexity of digital investigations. In addition, embedded storage may not be easily removable from devices, unlike for traditional desktop and server computers, and in some cases, devices will lack persistent storage entirely, necessitating expensive RAM forensics.

Investigating multiple devices also contributes to the consistency and correlation problem, where evidence gathered from distinct sources must be correlated for temporal and logical consistency. This is often performed manually, which is a significant drain on investigators’ resources. The requirement for RAM forensics also becomes pertinent in cases of anti-forensics, where a digital criminal takes measures to avoid evidence being acquired, including the creation of malware that resides in RAM alone. The increasing sophistication of digital criminals’ activities is also a substantial challenge.

Other issues include limitations on bandwidth for transferring data for investigation, the volatility of evidence, the fact that digital media has a limited lifespan that may possibly result in evidence being lost, and the increasing ubiquity of encryption in modern communications and data storage.

The following sections concentrate on a number of important emerging trends in modern computing that contribute to the problems outlined above.

Internet-of-Things


The Internet-of-Things (IoT) refers to a vision of everyday items that are connected to a network and send data to one another. Juniper Research (2015) estimates that there were already 13.4bn IoT devices in existence in 2015, and they expect this figure to reach 38.5bn by 2020. These IoT devices are typically deployed in two broad areas: in the consumer domain (smart home, connected vehicles, digital healthcare) and in the industrial domain (retail, connected buildings, agriculture). Some IoT devices are commonplace items that have Internet connectivity added (e.g. refrigerators, TVs), whereas others are newer sensing or actuation devices that have been developed with the IoT specifically in mind.

The IoT has the potential to become a rich source of evidence from the physical world, and as such it poses its own unique set of challenges for digital forensic investigators (Hegarty et al., 2014). Compared to traditional digital forensics, there is less certainty in where data originated and where it is stored. Data persistence may be a problem. IoT devices typically have limited memory (and may have no persistent data storage). Thus any data that is stored for longer periods may be stored in some in-network hub, or sent to the cloud for more persistent storage. This means that the challenges related to cloud forensics (as discussed below in Section 2.2) will likely apply in the IoT domain also.

Already, some efforts have begun to analyse IoT devices for forensics purposes (e.g. Sutherland et al. [2014] on smart TVs); however, this work is in its early stages at present. The heterogeneous nature of IoT devices, including differences in operating systems, file systems and communication standards, adds significantly to the complexity, diversity, and correlation problems for forensic investigators.

Ukil et al. (2011) outline some security concerns of IoT researchers, which feed directly into the desires of forensic investigators, incorporating issues such as availability, authenticity, and non-repudiation, which are important for legally-sound use of the data. These are addressed using encryption technologies, which are easy to incorporate into computationally powerful devices that are connected to mains energy. However, it becomes more of a challenge for smaller, battery-operated, computationally constrained devices where such considerations may be sacrificed. This has inevitable consequences for the usefulness of the data in a legal context.

Emerging Cloud Computing or Cloud Forensic Challenges

Usage of cloud services such as Amazon Cloud Drive, Office 365, Google Drive, and Dropbox is now commonplace amongst the majority of Internet users. From a digital forensics point of view, these services present a number of unique challenges, as has been reported in the 2014 National Institute of Standards and Technology’s draft report (NIST, 2014). Typically, data in the cloud is distributed over a number of distinct nodes, unlike more traditional forensic scenarios where data is stored on a single machine. Due to the distributed nature of cloud services, data can potentially reside in multiple legal jurisdictions, leading to investigators relying on local laws and regulations regarding the collection of evidence (Simou et al., 2014, Ruan et al., 2013). This can potentially increase the time, cost, and difficulty associated with a forensic investigation. From a technical standpoint, the fact that a single file can be split into a number of data blocks that are then stored on different remote nodes adds another layer of complexity, thereby making traditional digital forensic tools redundant (Chen et al., 2015, Almulla et al., 2013).

Additionally, the Cloud Service Providers (CSP) and their user base must be taken into consideration. Investigators are reliant on the willingness of CSPs to allow for the acquisition and reproduction of data. The lack of standardisation among the varying CSPs, differing levels of data security, and their Service Level Agreements are obstacles to both cloud forensic researchers and investigators (Almulla et al., 2013). The multi-tenancy of many cloud systems poses three significant challenges to digital forensic investigations. In the majority of cases the privacy and confidentiality of legitimate users must be taken into account by investigators due to the shared infrastructures that support cloud systems (Morioka and Sharbaf, 2015). The distributed nature of cloud systems, along with multi-tenancy, can require the acquisition of vast volumes of data leading to many of the challenges outlined below. Finally, the use of IP anonymity and the easy-to-use features of many cloud systems, such as requiring minimal information when signing up for a service, can lead to situations where identifying a criminal is near impossible (Chen et al., 2012, Ruan et al., 2013). Cloud forensics also faces a number of challenges associated with traditional digital forensic investigations. Encryption and other anti-forensic techniques are commonly used in cloud-based crimes. The limited time for which forensically-important data is available is also an issue with cloud-based systems. Because such systems run continuously, data can be overwritten at any time. Time of acquisition has also proved a challenging task in regard to cloud forensics. Thethi and Keane (2012) showed that commonly-used forensic tools such as the Linux dd command and Amazon’s AWS Snapshot took a considerable amount of time to acquire 30Gb of data from a cloud service.

While advances continue with regard to the tools and techniques used in cloud forensics, the aforementioned challenges continue to impede investigations. Henry et al. (2013) produced results showing that investigations on cloud-based systems make up only a fraction of all digital forensic investigations. Many investigations are stalled beyond the point of a perpetrator’s owned devices and rarely extend into the cloud-based services they use. Results such as these form a strong argument for continued research in this field.

FUTURE RESEARCH

Distributed Processing

Distributed Digital Forensics has been discussed for some time (Roussev and Richard III, 2004, Shanmugasundaram et al., 2003, Garfinkel et al., 2009, Beebe, 2009). However, there is more scope for it to be put into practice. Roussev et al. (2013) cite two main reasons that the processing speed of current generation digital forensic tools is inadequate for the average case: First, users have failed to formulate explicit performance requirements; second, developers have failed to put performance as a top-level concern in line with reliability and correctness. They proposed and validated a new approach to target acquisition that enables file-centric processing without disrupting optimal data throughput from the raw device. Their evaluation of core forensic processing functions with respect to processing rates shows intrinsic limitations in both desktop and server scenarios. Their results suggest that with current software, keeping up with a commodity SATA HDD at 120 MB/s requires between 120 and 200 cores.
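As a rough reading of the last figure, the short Python calculation below derives the average per-core processing rate implied by the reported core counts. It assumes the work divides evenly across cores with no coordination overhead, which is a simplification introduced here rather than a claim from the cited study.

```python
def per_core_rate_mb_s(disk_rate_mb_s, cores):
    """Average rate each core sustains if the load is split evenly."""
    return disk_rate_mb_s / cores


DISK_RATE_MB_S = 120.0  # commodity SATA HDD rate quoted above

rate_with_120_cores = per_core_rate_mb_s(DISK_RATE_MB_S, 120)
rate_with_200_cores = per_core_rate_mb_s(DISK_RATE_MB_S, 200)
```

Under that assumption, each core processes only about 0.6 to 1.0 MB/s, which illustrates how far per-core forensic processing lags behind raw disk throughput.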

HPC and Parallel Processing

Despite the bottleneck of many digital forensic operations being disk read-speed, there are steps in the process that are not limited by the physical read-speed of the storage device. For instance, the analysis phase can consume large amounts of time by computers and humans. High performance computing (HPC) advantages should be employed wherever possible to reduce computation time, and in an effort to reduce the time required by humans. Traditional HPC techniques normally exploit some level of parallelism, and to date have been underexploited by the digital forensic community. There are many applications where HPC techniques and hardware could be employed, for instance, on expediting each part of the digital forensic process after the acquisition phase, i.e., preprocessing, storage, analysis, and reporting.
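One simple form of the parallelism described above is hashing independent chunks of already-acquired data concurrently. The Python sketch below is illustrative only: the chunk size, worker count, and function names are assumptions, and it uses threads on a single machine rather than an HPC cluster. Threads are adequate for this particular task because CPython's `hashlib` releases the global interpreter lock while hashing large buffers.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

CHUNK_SIZE = 1024 * 1024  # 1 MiB per work item; an arbitrary illustrative value


def split_chunks(data, size=CHUNK_SIZE):
    """Cut the data into fixed-size chunks (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def hash_chunk(chunk):
    return hashlib.sha256(chunk).hexdigest()


def parallel_chunk_hashes(data, workers=4):
    """Hash all chunks concurrently; map() returns results in input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(hash_chunk, split_chunks(data)))


image = bytes(range(256)) * (3 * 4096)  # 3 MiB of stand-in "evidence" data
hashes = parallel_chunk_hashes(image)
```

Because each chunk is processed independently, the same decomposition could in principle be spread over multiple machines, at the cost of moving the data to them.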

GPU-Powered Multi-threading

GPUs excel at “single instruction, multiple data” (SIMD) computations with large numbers of general-purpose stream processors that can execute massively-threaded algorithms for a number of applications and stand to do so for many digital forensics requirements in theory.

Marziale et al. (2007) noted that GPUs have traditionally been both difficult to program and targeted at very specific problems. More recently, multicore CPUs coupled with GPU accelerators have been widely used in high-performance computing due to better power efficiency and performance/price ratio (Zhong et al., 2012). In addition, there is now a multitude of integrated GPUs that are on the same silicon die as the CPU, bringing both easier programming models and greater efficiency.

With new heterogeneous architectures and programming models such as these, powerful and efficient computer systems can be found in workstations with transparent access to CPU virtual addresses and very low overhead for computation offloading, and Power et al. (2015) have shown such architectures to be advantageous in analytic processing. These seem very well suited for many digital forensics applications, particularly as technologies such as SSDs reduce the I/O bottleneck.

Nonetheless, the use of GPUs in digital forensics is largely absent from the literature and there are few standard digital forensic tools that utilise GPU acceleration. Marziale et al. (2007) measured the effectiveness of offloading processing typical to digital forensics tools (such as file carving) to GPUs and found significant performance gains compared to simple threading techniques on multicore CPUs. Although the programming of the GPUs was more complex, the authors found that the effort was worth the performance gains. Collange et al. (2009) researched the feasibility of employing GPUs to accelerate the detection of sectors from contraband files using sector-level hashes. Their application was able to inspect several disk drives simultaneously and asynchronously from each other. In addition, disks from different computers can be inspected independently by the application. This approach indicated that the use of GPUs is viable.
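The idea of sector-level hash matching can be illustrated on a CPU with a few lines of Python. This sketch is not the GPU implementation of Collange et al.; the 512-byte sector size, the choice of SHA-1, and all function names are assumptions made for illustration.

```python
import hashlib

SECTOR_SIZE = 512  # bytes; assumed sector size


def sector_hashes(data, sector_size=SECTOR_SIZE):
    """Yield (sector_index, digest) for each complete sector in data."""
    for offset in range(0, len(data) - sector_size + 1, sector_size):
        sector = data[offset:offset + sector_size]
        yield offset // sector_size, hashlib.sha1(sector).digest()


def build_known_set(known_files):
    """Collect the sector digests of every known file of interest."""
    known = set()
    for content in known_files:
        known.update(digest for _, digest in sector_hashes(content))
    return known


def scan_image(image, known):
    """Return the indices of image sectors whose digest is in the known set."""
    return [index for index, digest in sector_hashes(image) if digest in known]


known_file = b"A" * 512 + b"B" * 512           # stand-in for a file of interest
image = b"\x00" * 1024 + b"B" * 512 + b"\x00" * 512
known = build_known_set([known_file])
hits = scan_image(image, known)                 # sector 2 holds a known block
```

Each sector is hashed and looked up independently of every other sector, which is why the workload maps naturally onto massively parallel hardware. A practical tool would also need to discount trivially common sectors, such as those filled with zeros.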

However, Zha and Sahni (2011) employed multi-pattern search algorithms to reduce the time needed for file carving with Scalpel, showing that the limiting factor for performance is disk read time. The authors state there is no advantage to using GPUs, at least until mechanisms to read the disk faster are found. However, this conclusion assumes only one disk, and the traditional digital forensic model. In the new era of cloud forensics, SSDs, and other technological evolutions, this I/O bottleneck will be much less restrictive.

Iacob et al. (2015) have employed GPUs in information-retrieval cases where response time is of importance, similar to Digital Forensics. They demonstrate significant speed-up of two Bloom filter operations, which are used in approximate matching forensic applications (Breitinger and Roussev, 2014).
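For readers unfamiliar with Bloom filters, the following is a minimal CPU-only Python sketch of the data structure with its two basic operations, insertion and membership query. Whether these correspond exactly to the operations accelerated by Iacob et al. is not stated here, and the filter size, hash count, and SHA-256-based hashing scheme are illustrative assumptions.

```python
import hashlib


class BloomFilter:
    """Probabilistic set: no false negatives, occasional false positives."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((size_bits + 7) // 8)

    def _positions(self, item):
        # Derive num_hashes bit positions by salting one hash function.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


bloom = BloomFilter()
bloom.add(b"feature-1")
found = b"feature-1" in bloom   # True: an inserted item is always found
```

Both operations touch only a handful of independent bit positions per item, so many items can be inserted or queried in parallel.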

GPUs, like many new technologies, present new considerations for digital forensics. Breß et al. (2013) researched the use of GPUs to process confidential/sensitive information and found that data in GPU RAM is retrievable by unauthorised users by creating a dump of device memory. However, this does not impede the use of GPUs for processing confidential information when the system itself is only accessible to authorised users.

DFaaS

Digital Forensics as a Service (DFaaS) is a modern extension of the traditional digital forensic process. Since 2010, the Netherlands Forensic Institute (NFI) have implemented a DFaaS solution in order to combat the volume of backlogged cases (van Baar et al., 2014). This DFaaS solution takes care of much of the storage, automation, and investigator enquiry in the cases it manages. Van Baar et al. (2014) describe the advantages of the current system including efficient resource management, enabling detectives to directly query the data, improving the turnaround time between forming a hypothesis in an investigation and its confirmation based on the evidence, and facilitating easier collaboration between detectives working on the same case through annotation and shared knowledge.

While the aforementioned DFaaS system is a significant step in the right direction, many improvements to the current model could greatly expedite and improve upon the current process. This includes improving the functionality available to the case detectives, improving its current indexing capabilities and on-the-fly identification of incriminating evidence during the acquisition process (van Baar et al., 2014).

Seeing as the DFaaS model is a cloud-based, remote access model, two significant disadvantages to the model are potential latency in using the online platform and being dependent on the upload bandwidth available during the physical storage acquisition phase of the investigation. A deduplicated evidence storage system, such as that described by Watkins et al. (2009), would facilitate faster acquisition, with each unique file across a number of investigations only needing to be stored, indexed, analysed, and annotated once on the system. Eliminating non-pertinent, benign files during the acquisition phase of the investigation (e.g., operating system, application, and previously acquired non-incriminating files) would greatly reduce the acquisition time. This could greatly expedite pertinent information being available to the detectives working on the case as early as possible in the investigation. In order for any evidence to be court admissible, a forensically sound entire disk image would need to be reconstructible from the deduplicated data store, improving upon the system proposed by Watkins et al. (2009). Employing such a system would also facilitate cloud-to-cloud storage event monitoring of virtual systems, as only the changes to the virtual storage would need to be stored between acquisitions.
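A deduplicated store from which a complete image can be rebuilt might be sketched as follows. This is not the system of Watkins et al.; it is a simplified Python illustration that deduplicates fixed-size blocks rather than whole files, and the class name, block size, and in-memory dictionaries are all assumptions.

```python
import hashlib


class DedupStore:
    """Content-addressed block store: each unique block is kept only once."""

    def __init__(self, block_size=4096):
        self.block_size = block_size
        self.blocks = {}      # digest -> block bytes
        self.manifests = {}   # image name -> ordered list of block digests

    def acquire(self, name, image):
        """Store an image and return how many blocks were actually new."""
        manifest = []
        new_blocks = 0
        for offset in range(0, len(image), self.block_size):
            block = image[offset:offset + self.block_size]
            digest = hashlib.sha256(block).hexdigest()
            if digest not in self.blocks:
                self.blocks[digest] = block
                new_blocks += 1
            manifest.append(digest)
        self.manifests[name] = manifest
        return new_blocks

    def reconstruct(self, name):
        """Rebuild the full image, byte for byte, from its manifest."""
        return b"".join(self.blocks[d] for d in self.manifests[name])


shared_os_data = b"O" * 8192                 # stand-in for common system files
image_a = shared_os_data + b"a" * 4096
image_b = shared_os_data + b"b" * 4096

store = DedupStore()
new_for_a = store.acquire("case-1-disk", image_a)
new_for_b = store.acquire("case-2-disk", image_b)
```

The ordered manifest is what allows the original image to be reproduced exactly, so its hash can be verified against the one taken at acquisition. The second acquisition only needs to transfer the single block that the store has not seen before.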

Field-programmable Gate Arrays

FPGAs are integrated circuits that can be configured after manufacture. FPGAs can implement any function that application-specific integrated circuits can, and offer several advantages over traditional CPUs. FPGAs can exploit inherent algorithmic parallelism (including low-level parallelism), and can often achieve results in fewer logic operations compared to traditional general purpose CPUs, resulting in faster processing times. FPGAs have recently found application in areas such as digital signal processing, imaging and video applications, and cryptography. Despite demonstrating desirable traits for digital forensics researchers, they have yet to be exploited for non-I/O-bound facets of digital forensics. Furthermore, as SSDs and other technologies ease the I/O bottleneck, FPGAs stand to be more broadly applicable in digital forensics.

Applying Complementary Cutting Edge Research to Forensics

Current investigation practice involves the analysis of data on standalone workstations. As such, the sophistication of the techniques that can be practically employed is limited. Much research has been conducted in a variety of areas that have theoretical relevance to digital forensics, but also have been impractical to apply to date. A movement towards DFaaS and high-performance computing, as discussed above, offers advantages beyond merely expediting the techniques currently used in forensics investigations, which remain reliant on manual input. It also promises a situation where this complementary research may practically be brought to bear on digital forensic investigations.

One such research area is that of Information Retrieval (IR). Traditionally, IR is concerned with identifying documents within a corpus that help to satisfy a user’s “information need.” Traditionally, IR researchers have been faced with the trade-off between the competing goals of precision (retrieving only relevant documents) and recall (retrieving all the relevant documents), whereby improving on one of these metrics typically results in a reduction in the other. In IR for legal purposes, recall has long been acknowledged as being the more important metric, given that a single missing relevant document could have serious consequences for the prosecution of a criminal case, the enforcement of a contract, etc. However, focusing on recall frequently results in an investigator being required to manually sift through a large quantity of non-relevant documents. This is in contrast to web search, for example, where users typically do not require all of the relevant documents to be retrieved, of which there may possibly be millions. Instead, a web searcher wishes to avoid wasting time on non-relevant material.
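The trade-off between the two metrics can be seen in a small worked example. The Python sketch below uses the standard set-based definitions of precision and recall; the document identifiers are invented for illustration.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    precision = |retrieved AND relevant| / |retrieved|
    recall    = |retrieved AND relevant| / |relevant|
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


relevant = {"doc1", "doc2", "doc3", "doc4"}

# A narrow query returns only relevant documents but misses half of them.
narrow = precision_recall({"doc1", "doc2"}, relevant)

# A broad query finds every relevant document but also four irrelevant ones.
broad = precision_recall(
    {"doc1", "doc2", "doc3", "doc4", "doc5", "doc6", "doc7", "doc8"},
    relevant,
)
```

Here the narrow query scores precision 1.0 and recall 0.5, while the broad query scores precision 0.5 and recall 1.0: full recall is bought at the cost of four extra documents for the investigator to review.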

IR for digital forensics is often seen as a typical example of legal information retrieval (e.g. by Beebe and Clark [2007]). Although this is certainly true at the point a case is being built for court, it could be argued that the level of recall required at the triage stage can be sacrificed somewhat for greater precision in order to allow investigators to make speedy decisions about whether a given device should be investigated fully. Thus, there is the potential for configurable IR systems to be utilised in forensics investigations, whose focus will change depending on the stage of the investigation.

The primary advantage of applying IR techniques to digital investigations is that once the initial preprocessing stage has been completed, searches can be conducted extremely quickly. Furnas et al. (1987) have shown that less than 20% of searchers choose the same keywords for topics they are interested in. This suggests that many queries must be run to achieve full recall, and also suggests that standard IR techniques such as query expansion and synonym matching could also be applied to increase recall.
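Query expansion with synonyms can be sketched in a few lines of Python. The synonym table, documents, and whitespace tokenisation below are hypothetical and deliberately simplistic; a real system would draw on a thesaurus or learned term associations and a proper index.

```python
# Hypothetical synonym table, for illustration only.
SYNONYMS = {
    "money": {"cash", "funds"},
    "car": {"vehicle", "automobile"},
}


def expand_query(terms, synonyms=SYNONYMS):
    """Return the query terms together with their known synonyms."""
    expanded = set()
    for term in terms:
        term = term.lower()
        expanded.add(term)
        expanded |= synonyms.get(term, set())
    return expanded


def search(documents, terms, expand=True):
    """Return sorted ids of documents containing any (expanded) query term."""
    wanted = expand_query(terms) if expand else {t.lower() for t in terms}
    return sorted(
        doc_id
        for doc_id, text in documents.items()
        if wanted & set(text.lower().split())
    )


documents = {
    "msg1": "bring the cash tomorrow",
    "msg2": "the money is in the bag",
    "msg3": "meeting moved to noon",
}
plain_hits = search(documents, ["money"], expand=False)
expanded_hits = search(documents, ["money"])
```

Without expansion only `msg2` matches; with expansion `msg1` is also retrieved because it uses a different word for the same concept, which raises recall for a single query.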

However, increasing recall typically reduces precision by also retrieving non-relevant documents as false positives. There are a number of ways in which this problem can be alleviated. The use of the aforementioned data deduplication techniques would eliminate standard system files from consideration (Beebe and Dietrich [2007] note that the word “kill” appears as a command in many system files). Additionally, common visualisation approaches such as ranking (Beebe and Liu, 2014) and clustering (Beebe et al., 2011) are likely to help investigators in their manual search of retrieved documents.
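As an illustration of how known system files might be excluded before a keyword search, the sketch below filters files whose SHA-256 digest appears in a set of known hashes. The file paths, contents and the known-hash set are hypothetical; in practice the set would be populated from a reference database of standard files.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the given bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


def exclude_known(files, known_hashes):
    """Drop files whose digest matches a known (e.g. standard system) file."""
    return {
        path: data
        for path, data in files.items()
        if sha256_hex(data) not in known_hashes
    }


# Hypothetical standard system script containing the word "kill".
system_script = b"#!/bin/sh\nkill -9 $1\n"
known = {sha256_hex(system_script)}

# Hypothetical acquired files: one standard script and one user-created note.
acquired = {
    "/usr/bin/stopproc": system_script,
    "/home/user/note.txt": b"a user note that mentions kill",
}

remaining = exclude_known(acquired, known)
keyword_hits = sorted(path for path, data in remaining.items() if b"kill" in data)
```

With the standard script filtered out, a search for “kill” returns only the user-created file, removing one source of false positives.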

Another consideration is that event timeline reconstruction is extremely important in a criminal investigation (Chabot et al., 2014). When constructing a timeline from digital evidence, some temporal data is readily available (e.g. chat logs, file modification times, email timestamps, etc.), although it should be acknowledged that even this is not without its own challenges. Within the IR community, much research has been conducted into the extraction of temporal information from unstructured text (Campos et al., 2014). Such techniques could substantially reduce the manual load for investigators in this area.
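A deliberately simple sketch of temporal extraction follows. It recognises only explicit dates written as YYYY-MM-DD and is an assumption-laden illustration, not a method from the cited survey; research systems also handle relative and implicit expressions such as “last Tuesday”.

```python
import re
from datetime import datetime

# Matches candidate dates written as YYYY-MM-DD; validity is checked separately.
ISO_DATE = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")


def extract_dates(text):
    """Return the valid YYYY-MM-DD dates found in the text, in chronological order."""
    dates = []
    for match in ISO_DATE.finditer(text):
        try:
            dates.append(datetime.strptime(match.group(0), "%Y-%m-%d").date())
        except ValueError:
            # Looks like a date but is not a real calendar date (e.g. month 13).
            continue
    return sorted(dates)


# Hypothetical free-text note recovered from a device.
note = "Paid on 2016-05-25, as agreed on 2016-04-01 (ref 2016-13-40)."
timeline = extract_dates(note)
```

The extracted dates could then be merged with file system and log timestamps when assembling a candidate timeline for review.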

CONCLUSION

In this paper, a number of current challenges in the field of digital forensics are discussed. Each of these challenges in isolation can hamper the discovery of pertinent information for digital investigators and detectives involved in a multitude of different cases requiring digital forensic analysis. Combined, the negative effect of these challenges is amplified. The digital evidence backlog is currently in the order of years for many law enforcement agencies worldwide. The predicted ballooning of case volume in the near future will serve to further compound the backlog problem, particularly as the volume of evidence from cloud-based and Internet-of-Things sources continues to increase. In terms of research directions, practices already in place in many Computer Science sub-disciplines hold promise for addressing these challenges, including those in distributed, parallel, GPU and FPGA processing, as well as information retrieval techniques. These research directions can be applied to digital forensics requirements to help combat the backlog by improving and expediting the digital forensic process itself, thereby allowing more efficient allocation of precious digital forensic expert time.


REFERENCES

Alexandru Iacob, Lucian Itu, Lucian Sasu, Florin Moldoveanu, and Constantin Suciu. GPU accelerated information retrieval using Bloom filters. In System Theory, Control and Computing (ICSTCC), 2015 19th International Conference on, pages 872–876. IEEE, 2015.

Arijit Ukil, Jaydip Sen, and Sripad Koilakonda. Embedded security for Internet of Things. In 2011 2nd National Conference on Emerging Trends and Applications in Computer Science, pages 1–6. IEEE, March 2011. ISBN 978-1-4244-9578-8.

Ben Hitchcock, Nhien-An Le-Khac, and Mark Scanlon. Tiered forensic methodology model for digital field triage by non-digital evidence specialists. Digital Investigation, 13(S1), March 2016. Proceedings of the Third Annual DFRWS Europe.

Darren Quick and Kim-Kwang Raymond Choo. Impacts of increasing volume of digital forensic data: A survey and future research challenges. Digital Investigation, 11(4):273–294, 2014.

E. Morioka and M.S. Sharbaf. Cloud computing: Digital forensic solutions. In Information Technology - New Generations (ITNG), 2015 12th International Conference on, pages 589–594, April 2015.

Frank Breitinger and Vassil Roussev. Automated evaluation of approximate matching algorithms on real data. Digital Investigation, 11:S10–S17, 2014.

Garda Síochána Inspectorate. Changing Policing in Ireland, November 2015.

George W. Furnas, Thomas K. Landauer, Louis M. Gomez, and Susan T. Dumais. The vocabulary problem in human-system communication. Communications of the ACM, 30(11):964–971, 1987.

Guangxuan Chen, Yanhui Du, Panke Qin, and Jin Du. Suggestions to digital forensics in cloud computing era. In Network Infrastructure and Digital Content (IC-NIDC), 2012 3rd IEEE International Conference on, pages 540–544, September 2012.

Iain Sutherland, Huw Read, and Konstantinos Xynos. Forensic analysis of smart TV: A current issue and call to arms. Digital Investigation, 11(3):175–178, September 2014.

Ibrahim Baggili, Jeff Oduro, Kyle Anthony, Frank Breitinger, and Glenn McGee. Watch What You Wear: Preliminary Forensic Analysis of Smart Watches. In 2015 10th International Conference on Availability, Reliability and Security, pages 303–311. IEEE, August 2015. ISBN 978-1-4673-6590-1.

Jason Power, Yinan Li, Mark D Hill, Jignesh M Patel, and David A Wood. Toward GPUs being mainstream in analytic processing. 2015.

Joshua I James and Yunsik Jake Jang. Measuring digital crime investigation capacity to guide international crime prevention strategies. In Future Information Technology, pages 361–366. Springer, 2014.

Juniper Research. The Internet of Things: Consumer, Industrial & Public Services 2015-2020, July 2015.


Kathryn Watkins, Mike McWhorte, Jeff Long, and Bill Hill. Teleporter: An analytically and forensically sound duplicate transfer system. Digital Investigation, 6:S43–S47, 2009.

Keyun Ruan, Joe Carthy, Tahar Kechadi, and Ibrahim Baggili. Cloud forensics definitions and critical criteria for cloud forensic capability: An overview of survey results. Digital Investigation, 10(1):34–43, 2013.

Kulesh Shanmugasundaram, Nasir Memon, Anubhav Savant, and Herve Bronnimann. ForNet: A distributed forensics network. In Computer Network Security, pages 1–16. Springer, 2003.

Lei Chen, Lanchuan Xu, Xiaohui Yuan, and N. Shashidhar. Digital forensics in social networks and the cloud: Process, approaches, methods, tools, and challenges. In Computing, Networking and Communications (ICNC), 2015 International Conference on, pages 1132–1136, February 2015.

Lodovico Marziale, Golden G Richard, and Vassil Roussev. Massive threading: Using GPUs to increase the performance of digital forensics tools. Digital Investigation, 4:73–81, 2007.

Mark Scanlon and M-Tahar Kechadi. Digital evidence bag selection for P2P network investigation. In Proceedings of the 7th International Symposium on Digital Forensics and Information Security (DFIS-2013), pages 307–314. Springer, Gwangju, South Korea, 2014.

Mohammad Wazid, Avita Katal, RH Goudar, and Smitha Rao. Hacktivism trends, digital forensic tools and challenges: A survey. In Information & Communication Technologies (ICT), 2013 IEEE Conference on, pages 138–144. IEEE, 2013.

Neha Thethi and Anthony Keane. Digital forensics investigations in the cloud. In IEEE International Advance Computing Conference (IACC), September 2012.

Nickson M Karie and Hein S Venter. Taxonomy of challenges for digital forensics. Journal of Forensic Sciences, 60(4):885–893, 2015.

Nicole Beebe and Glenn Dietrich. A new process model for text string searching. In Advances in Digital Forensics III, pages 179–191. Springer, 2007.

Nicole Beebe. Digital forensic research: The good, the bad and the unaddressed. In Advances in Digital Forensics V, pages 17–36. Springer, 2009.

Nicole Lang Beebe and Jan Guynes Clark. Digital forensic text string searching: Improving information retrieval effectiveness by thematically clustering search results. Digital Investigation, 4(SUPPL.):49–54, 2007.

Nicole Lang Beebe and Lishu Liu. Ranking algorithms for digital forensic string search hits. Digital Investigation, 11(SUPPL. 2):314–322, 2014.

Nicole Lang Beebe, Jan Guynes Clark, Glenn B. Dietrich, Myung S. Ko, and Daijin Ko. Post-retrieval search hit clustering to improve information retrieval effectiveness: Two digital forensics case studies. Decision Support Systems, 51(4):732–744, 2011.

NIST. NIST cloud computing forensic science challenges. 2014.

Paul Henry, Jacob Williams, and Benjamin Wright. The SANS survey of digital forensics and incident response. In Tech Rep, July 2013.

RB van Baar, HMA van Beek, and EJ van Eijk. Digital forensics as a service: A game changer. Digital Investigation, 11:S54–S62, 2014.

Ricardo Campos, Gaël Dias, Alípio M Jorge, and Adam Jatowt. Survey of temporal information retrieval and related applications. ACM Computing Surveys (CSUR), 47(2):15, 2014.

Robert C. Hegarty, David J. Lamb, and Andrew Attwood. Interoperability Challenges in the Internet of Things. In Paul Dowland, Steven Furnell, and Bogdan Ghita, editors, Proceedings of the Tenth International Network Conference (INC 2014), pages 163–172. Plymouth University, 2014.

S. Almulla, Y. Iraqi, and A. Jones. Cloud forensics: A research perspective. In Innovations in Information Technology (IIT), 2013 9th International Conference on, pages 66–71, March 2013.

Sebastian Breß, Stefan Kiltz, and Martin Schäler. Forensics on GPU coprocessing in databases – research challenges, first experiments, and countermeasures. In BTW Workshops, pages 115–129. Citeseer, 2013.

Simson Garfinkel, Paul Farrell, Vassil Roussev, and George Dinolt. Bringing science to digital forensics with standardized forensic corpora. Digital Investigation, 6:S2–S11, 2009.

Simson L Garfinkel. Digital forensics research: The next 10 years. Digital Investigation, 7:S64–S73, 2010.

Sriram Raghavan. Digital forensic research: current state of the art. CSI Transactions on ICT, 1(1):91–114, 2013.

Stavros Simou, Christos Kalloniatis, Evangelia Kavakli, and Stefanos Gritzalis. Cloud forensics solutions: A review. In Lazaros Iliadis, Michael Papazoglou, and Klaus Pohl, editors, Advanced Information Systems Engineering Workshops, volume 178 of Lecture Notes in Business Information Processing, pages 299–309. Springer International Publishing, 2014. ISBN 978-3-319-07868-7.

Sylvain Collange, Yoginder S Dandass, Marc Daumas, and David Defour. Using graphics processors for parallelizing hash-based data carving. In System Sciences, 2009. HICSS’09. 42nd Hawaii International Conference on, pages 1–10. IEEE, 2009.

Vassil Roussev and Golden G Richard III. Breaking the performance wall: The case for distributed digital forensics. In Proceedings of the 2004 Digital Forensics Research Workshop, volume 94, 2004.

Vassil Roussev, Candice Quates, and Robert Martell. Real-time digital forensics and triage. Digital Investigation, 10(2):158–167, 2013.

Xinyan Zha and Sartaj Sahni. Fast in-place file carving for digital forensics. In Forensics in Telecommunications, Information, and Multimedia, pages 141–158. Springer, 2011.

Yoan Chabot, Aurélie Bertaux, Tahar Kechadi, and Christophe Nicolle. Event reconstruction: A state of the art. Handbook of Research on Digital Crime, Cyberspace Security, and Information Assurance, page 15, 2014.

Ziming Zhong, Vladimir Rychkov, and Alexey Lastovetsky. Data partitioning on heterogeneous multicore and multi-GPU systems using functional performance models of data-parallel applications. In Cluster Computing (CLUSTER), 2012 IEEE International Conference on, pages 191–199. IEEE, 2012.