d7.1 validation plan - disiem-project.eudisiem-project.eu/wp-content/uploads/2018/11/d7.1.pdf ·...

ProjectDeliverable

D7.1ValidationPlan

ProjectNumber 700692ProjectTitle DiSIEM–Diversity-enhancementsforSIEMsProgramme H2020-DS-04-2015Deliverabletype ReportDisseminationlevel PUSubmissiondate August31,2018(M24)Responsiblepartner EDP–EnergiasdePortugal,S.A.Editor PedroDiasRodriguesRevision 1.2

TheDiSIEMproject has received funding from theEuropeanUnion’sHorizon2020researchandinnovationprogrammeundergrantagreementNo700692.

D7.1

2

EditorPedroDiasRodrigues,EDPContributorsOlivierThonnard,AmadeusMiruna-MihaelaMironescu,AmadeusZayaniDabbabi,AmadeusGustavoGonzalez,AtosMarioFaiella,AtosCagatayTurkay,CityUniversityofLondonFrancesBuontempo,CityUniversityofLondonIlirGashi,CityUniversityofLondonPhongNgueyen,CityUniversityofLondonAbdullahiAdamu,DigitalMRLilianaRibeiro,EDPAdrianoSerckumecka,FCiências.IDAnaRespício,FCiências.IDPedroFerreira,FCiências.IDAlyssonBessani,FCiências.IDMichaelKamp,FraunhoferIAISSimingChen,FraunhoferIAISVersionHistoryVersion Author Description1.0 EDP Firstversionofthedocumentforinternalreview1.1 EDP Documentreleasedtothecoordinatorforquality

checkandsubmission1.2 FCiências.ID FinalminorimprovementsandsubmissiontoEC

D7.1

3

ExecutiveSummaryThe DiSIEM project has defined several goals with the common ambition ofextendingandimprovingexistingSIEMsystems,havingthefollowingprinciplesinmind: (1) nomodification on existing SIEM systems, (2) the exploitation ofexistingextensionports inSIEMplatforms, (3) theexistenceofnon-integratedcomponents,and(4)nosubstantialadditionalconfigurationshouldbeneeded.AreferencearchitectureforthedevelopmentswasestablishedindeliverableD2.2,whilethedescriptionandtechnicalinformationregardingeachcomponentwaslaidoutinthedeliverablesforworkpackages3-6.Thisreportcollectsanddescribesthedesignedscenariosanddevelopmentsmadeduringthefirst24monthsoftheproject,detailingtheirintegrationandvalidationplanusingthepartners’SIEMplatforms.Thecomponentswillbeintegratedwiththe SIEM systems following three different patterns – Event Generator, EventInspector,andEventCollector–thataresupportedbythethreeplatformsselectedforvalidatingtheprojectinnovations:MicroFocusArcSight(providedbyEDP),XL-SIEM(providedbyATOS)andElasticsearch(providedbyAmadeus).Adetaileddescriptionoftheplatformsisprovidedinthisreport,notingthecommonalitiesanddifferencesandhowtheyarerelevanttoensurethatDiSIEMdevelopmentsmaybeintegratedwithmultipleSIEMplatforms.The integration scope is defined in detail, using scenario-based development,consideringthetestbedsmadeavailablebytheindustrypartners.Theinformationpresented here will guide the activities for the last 12 months of the project,including the definition of success criteriawhen integrating the developmentsmadebytheconsortiumpartnersintheprovidedtestenvironments,leadingtopossibledeploymentsinpartnersproductionenvironments.

D7.1

4

TableofContents1 Introduction....................................................................................................................................71.1 ValidationMethodology..................................................................................................71.2 DiSIEMArchitecture.........................................................................................................81.3 SIEMIntegration.................................................................................................................91.4 OrganizationoftheDocument.....................................................................................9

2 DiSIEMComponents.................................................................................................................112.1 Multi-levelRiskManager(WP3)..............................................................................122.2 DiversityandForecastingAnalyticsPlatform(WP6/WP5)........................142.2.1 DiversityandForecastingAnalyticsEngine(WP6)..............................142.2.2 DiversityandForecastingAnalyticsDashboard(WP5)......................16

2.3 OSINTThreatDetector(WP4)...................................................................................172.4 Listening247ThreatAnalyser(WP4)...................................................................182.5 ThreatIntelligenceIntegrator(WP4)....................................................................202.6 Network-basedAnomalyDetector(WP6)...........................................................212.7 SkepticII(WP6)................................................................................................................232.8 UserBehaviourAnalyticsPlatform(WP5)..........................................................252.9 SLiCERCloudArchiver(WP6)...................................................................................26

3 SIEMEnvironments..................................................................................................................293.1 MicroFocusArcSight(EDP)........................................................................................293.2 XL-SIEM(ATOS)................................................................................................................303.3 ElasticStack(AMADEUS).............................................................................................333.4 EnvironmentValidation................................................................................................34

4 IntegrationPlan..........................................................................................................................364.1 IntegrationwithSIEM....................................................................................................364.1.1 Multi-levelRiskManager....................................................................................364.1.2 DiversityandForecastingAnalyticsPlatform..........................................364.1.3 ThreatIntelligenceIntegrator.........................................................................364.1.4 Diversity-enhancedMonitoring......................................................................374.1.5 SLiCERCloudArchiver........................................................................................37

4.2 ComponentIntegration.................................................................................................374.2.1 DiversityandForecastingAnalyticsPlatform..........................................384.2.2 OSINTDataFusionandAnalysisComponents........................................384.2.3 SkepticII.....................................................................................................................39

5 Evaluation......................................................................................................................................405.1 SuccessCriteria.................................................................................................................405.2 SBDValidationScenarios.............................................................................................415.3 EthicalandLegalAspects.............................................................................................45

6 ConclusionsandFutureWork.............................................................................................47References...............................................................................................................................................48

D7.1

5

ListofFiguresFigure1–DiSIEMArchitecture.......................................................................................................8Figure2–DiSIEMarchitectureandcomponents.................................................................11Figure3–Multi-levelRiskManagercomponent..................................................................13Figure4–Dataflowofforecastingengine..............................................................................16Figure5–OSINTthreatdetectorcomponentarchitecture.............................................17Figure6–ThreatIntelligenceIntegratorarchitecture......................................................20Figure7–Network-basedAnomalyDetectionarchitecture..........................................22Figure8–Skepticarchitecture......................................................................................................24Figure9–UserBehaviourAnalyticsPlatformdataflow..................................................25Figure10–SLiCERCloudArchiverarchitecture..................................................................27Figure11–Atostestbed...................................................................................................................32Figure12–DiSIEMcomponentinterfaces..............................................................................38Figure13SkepticIIintegrationwithUserBehaviourAnalyticsPlatform...............39

D7.1

6

ListofTablesTable1–OptionsforListening247ThreatAnalyserintegration.................................19Table2–Listening247ThreatAnalyserusecases..............................................................20Table3–Network-basedAnomalyDetectionmodules....................................................23Table4–Skepticmodules...............................................................................................................24Table5–ATOSTestbedcomponentsandrequirements.................................................33Table6–Amadeustestenvironment.........................................................................................34Table7–Amadeusproductionenvironment.........................................................................34Table8–Componentvalidationtestenvironments...........................................................35Table9–Evaluationsuccesscriteria.........................................................................................41Table10–Scenario1Claims..........................................................................................................43Table11–Scenario2Claims..........................................................................................................45

D7.1

7

1 Introduction

TheDiSIEMprojectaimstoimprovecurrentSecurityOperationsCentres(SOC)bydeveloping a set of extensions for SIEM (Security Information and EventManagement) systems. These components will be validated in three testenvironmentsprovidedbyEDP,Atos,andAmadeus.Afterthat,asubsetofthemwill be deployed on production environments provided by these partners forfurthervalidation.Buildinguponthedesignanddevelopmentactivitiescarriedoutduringthefirst24monthsof theDiSIEMproject, thisdeliverablegathers relevant informationregarding the implementation and validation phase. Based on the DiSIEMreferencearchitectureandconsideringtheSIEMoperationalenvironmentsfromthe partners, requirements and scenarios are defined in order to guide theintegrationofthedevelopedcomponents.We adopted a scenario-based development (SBD) methodology to drive theintegrationactivities,whichisaneffectivewaytoanalyseandcommunicateuserrequirements[Carroll2000].InthecontextofDiSIEMwewillfocusthescenariosontheSOCoperation,transformingtheday-to-dayactivitiesoftheSOCteamsinreal-lifescenariosforcomponentutilization.

1.1 Validation Methodology

By using SBD, we argue that the scenario description will be as accurate aspossibleandtheSOCteamsevaluatingthedevelopedcomponentswillrelatewiththescenarios,sincetheyarebasedontheirdailyactivitiesandneeds,addressingtheshortcomingsofexistingSIEMsystems.Ascenariomustincludethesettingorcontextdescription,actors,goals,actionsandevents.Additionally,eachscenariomustclearlyidentifytheprojectedpositiveandnegativeeffectsofitsapplication.The idea is that each scenario is easilyunderstandablenotonlyby theDiSIEMproject team,butalsobySOCusersandexternalstakeholderssuchas industrypersonnel, development teams and researchers, thus contributing to thedisseminationeffortsoftheproject.The methodology consists of three separate phases – Analysis, Design andEvaluation. In theanalysisphasetherequirementsareassessedusingmethodssuchasinterviewswithexperts,state-of-the-artstudyanddiscussiongroups.Thedesignphaseconsistsofdefiningsolutions,detailingtheirobjectives,methodstoachieve those objectives and user experience considerations. Lastly, theevaluationphaseconsiderstheusageofthecomponent,firstlyinaprototypeandlateronthe finalversion,usingthat feedback to improvethedevelopmentandeventuallyre-designingthecomponent.Anothercriticalstepisthetargetaudienceidentification.Aswementionedbefore,thedevelopedcomponentsmightbeusedbydifferent typesofend-users, fromSOCoperatorstoC-levelmanagers.Therefore,eachscenarioshouldbeevaluatedfromdifferentviewpointsinordertoachieveathoroughanalysisandavalidendproduct.

D7.1

8

In DiSIEMwe are considering 4 different points of view to assess integrationscenarios:

1. SOCoperators: end-users in theSOC teams, theprimaryusersof SIEMsystems;

2. SIEMplatformmanagers:usersresponsiblefortheSIEMplatform,whomust consider its technological evolutions and limitations, as well asassessingtheeffortrequiredtointegratethedevelopedcomponents;

3. Componentdevelopers:usersresponsibleforcomponentdevelopment,whose main concerns are the effort required in order to adapt thecomponent to different SIEM platforms. These developers can also giveprecious feedback regarding the applicability of the components,specifically if the integration outcome corresponded to the theoreticalexpectationscreatedduringthecomponentdescription;

4. C-levelmanagement:seniormanagersintheorganizationthatareabletoassess the relevance and quality of the developed component outputs,especially the ones related with risk management or other business-relevantKeyPerformanceIndicators(KPI).

We believe that taking advantage of these 4 different profiles will yieldmorerobustandusefulcomponentintegrationresults.

1.2 DiSIEM Architecture

Deliverable D2.2 [D22] included a description of the reference architecture ofDiSIEMand itsmaincomponents,whichremainedmostlyvalidandunchangedduring the component design and implementation phase (minormodificationswillbeexplained inthenextchapter).Wereviewthehigh-levelarchitecture inFigure1,builtaroundtheSIEMintegrationthatwillbevalidatedthroughoutWP7.

Figure1–DiSIEMArchitecture

DiSIEM Architecture

5/31/18 2

OSINT Data Sources

(internet)

Reporting

Correlation

Normalization

SIEM

EventStorage

Visualisationand Analysis Tools

OSINT Data Analysis and Fusion

Managed Infrastructure

Diversity-Enhanced Monitoring

Cloud-of-clouds Event Archival

Public Cloud Storage Services

another cloud

…

…

D7.1

9

It is important to recall that DiSIEM defined as a cornerstone that all theimprovements will be made around the basic SIEM functionality, withoutrequiring modifications on the base system. There are four fundamentalarchitecturalprinciplesinDiSIEM[D22].

1.3 SIEM Integration

Alldevelopedcomponentswillbeintegratedandevaluatedinoneormoreofthepartner’s SIEM platforms: Micro Focus ArcSight (EDP), XL-SIEM (ATOS) andElasticsearch (Amadeus).Asexplained indeliverableD2.1 [D21], thevarietyofSIEM platforms available and the fact that they represent some of the mostrelevant commercial and experimental SIEM platforms in the market is verysignificative.Weareconfidentthatsuccessfulvalidationoutcomesinthesefourplatformsprovesthattheinterfacesareflexibleandensuresthatthecomponentsmay be integratedwith other SIEM systems.The components are divided intothreetypes:

• Event Generator: components that generate events for the SIEM. Forinstance,newintrusiondetectorsystemsorOSINTanalysers,whichfeedtheSIEMwithnewinformation;

• Event Inspector: components that access the SIEMdatabase to inspectdata(e.g.,assetsandtheirproperties)andsubsetsofcollectedevents.Thisisespeciallyimportantforvisualisationandanalysistools;

• EventCollector:ThecomponentneedstohavefullaccesstoeventsintheSIEM. An example is any side-systems that do additional processing ofevents,beyondwhattheSIEMiscapableof.

1.4 Organization of the Document

Thischapterpresentedabriefintroductiontotheaimsofthisdocumentandsomebackgroundmaterial, preparing the reader for the rest of the report,which isorganizedas follows.Chapter2describes– inhigh level – the ten componentsdesignedand implemented intheDiSIEMproject,whileChapter3presents theSIEMenvironmentsinwhichthecomponentswillbetestedandintegratedwith.TheintegrationplanislaidoutinChapter4,includingalltherelevantinterfaces

Principle1:DiSIEMcomponentsmustbe integrated toSIEMsystemswithoutrequiringanymodificationsinthebasesystem.

Principle2:ExtensionscanbemadeeitherbyfeedingneweventstotheSIEM,bytaking info fromtheSIEMdatabase,and/orby creatingnewwaystovisualiseSIEMdata.

Principle3:Non-integratedcomponentscanbeusediftheyoperateside-by-sidewiththeSIEM,withoutreplicatinganyexistingfunctionality.

Principle4:NosignificantmanualworkshouldberequiredtosetupandoperateDiSIEMcomponents.

D7.1

10

amid components and between the components and the SIEM platforms. InChapter 5 we address all the validation activities and how effective were theplanned integrations. Concluding the document, Chapter 6 presents our finalremarksandplansforfuturework.

D7.1

11

2 DiSIEM Components

In this chapter, we present the components developed in the DiSIEM project.Figure 2 shows the updated project reference architecture with all thecomponentsandtheirrelationshipwithothercomponentsandtheSIEM.

Figure2–DiSIEMarchitectureandcomponents

Thefigureshowsninecomponents,dividedinfourgroups,inaccordancewiththetypeofenhancedprovidedbythecomponent.TheOSINTDataFusion&Analysisgroup contains the components related with the acquisition, processing, andintegrationofOSINTdata,mostlydeveloped inWP4.TheDiversity-enhancedMonitoringcomponentscomprisetheenhancedsensorsdevelopedintheproject,mostlyinWP6.AsforInfrastructureEnhancements,itisagroupwithasinglecomponent,alsodevelopedinWP6,whichimprovesthecapacityoftheSIEMtosecurelyarchiveeventsforlongtimeperiods.Thelastgroup,VisualisationandAnalysis Tools, contains the components for visual analysis and forecast ofcollectedSIEMdata.MostofthesecomponentsarecreatedinWP5,implementingalsotechniquesdevelopedinWP3.The architecture of Figure 2 updates the reference architecture described indeliverableD2.2inthreeways.First,thecomponentnameswerechangedtomakethem more mnemonic and relatable. Second, two components of theVisualisationandAnalysisToolsgroupthatcouldnotbeusedseparatelywere

SIEMReporting

CorrelationNormalization

EventStorage

Skeptic IINetwork-based Anomaly Detector

PublicCloud

Storage Services

OSINT Data Sources (Internet)

SLiCER Cloud Archiver

Listening 247Threat

Analyser

User Behaviour Analytics Platform

VisualizaNon and Analysis Tools

Diversity-enhanced Monitoring

OSINT Data Fusion & Analysis

OSINT Threat Detector

Infrastructure Enhancements

Threat Intelligence Integrator

Managed Infrastructure

raw eventsenriched eventsbulk data transfer

MulN-level Risk Manager

Diversity and Forecasting Analytics Platform

Diversity and Forecasting

Analytics Engine

Diversity and Forecasting

Analytics Dashboard

D7.1

12

merged in a newDiversity ForecastingAnalytics Platform component. Third, anew component (Multi-level Risk Management) was introduced in thearchitecture.Thiscomponentcorrespondstothematerializationofsomeresultsfromwork package 3 that could not be accurately implemented in any othercomponents.Allthecomponentsrepresentedinthisfigurewillbedescribedinthenextsections.

2.1 Multi-level Risk Manager (WP3)

TheMulti-levelRiskManager(MRM)componentispartofthevisualizationandanalysissetof tools inFigure2. It implements theMulti-levelRiskAssessment(MRA)modelproposedindeliverableD3.1[D31]andallowsonetointegratetheresults of the risk assessment with the SIEM. The risk assessment process ofDiSIEM is asset-oriented. The MRA model considers three levels of assets:services, softwareapplications, andhosts.Hosts supportsoftwareapplications,which can interact between each other, and services are composed by sets ofapplications. Themodel relies on three variables to compute the security riskscoreof eachasset: the riskassociatedwithvulnerabilities, the riskassociatedwithsecurityincidents,andtheriskinheritedfromdependenciesonotherassets.The MRM component architecture is presented in Figure 3. The central sub-componentoftheMRMisthemodel’sdatabase.Thisdatabaseispopulatedwiththeinformationon:

– thecompany’sassetsandthedependenciesbetweenthem,includingthemultiplelevelsdefinedinthemodel(services,softwareapplications,andhosts);

– the infrastructurevulnerabilities(discoveredbya third-partysoftware)andthevulnerabilitiesinthecompany’ssoftwareapplications(discoveredbypen-testingorothersoftwarevulnerabilitiesdiscoverytool);and

– thehistoryofrecentpastsecurityincidents.Topopulatethemodel’sdatabasewithallthisdata,aDBImportmodulemustbeimplemented. Similarly, the results of the risk assessment are exported to theSIEMbyaDBExportmodulethatextractstheinformationrelevantandfeedstheSIEM. The SIEM platform will then be programmed to differentiate alerts forevents involving assetswith higher risk scores to enhance itsmonitoring anddetectioncapability,thusimprovingriskmitigation.TheMRMinteractswiththedatabasetoupdatetheriskscoreoftheassets.Alltheinteraction between the users and the MRM is done through a dashboard,connectedtothedatabase,thatpresentstheriskassessmentresultsandwhereitispossibletoconfigurethemodel’sparameters.Thisdashboardwillprovidetheabilitytodrill-downintothedetailsofsecurityrisk,whilealsoenablinganalyticcapabilities to the SOC team and security managers. Decision-making will besupported by an integrated view of the infrastructure security risk, includinginformation about vulnerability severity and relevant incidents, coupled withinterdependenciesbetweenassetsforadditionalcontext.

D7.1

13

Figure3–Multi-levelRiskManagercomponent

On top of that, security risk analyticswill facilitate the generation of enrichedsecurityreportstosupportC-levelmanagers’decision-makingprocesses.To integrate theMRM componentwith a SIEM, the following sub-componentsmustbecustomised:

– TheDBImportmodule,whichshouldbeabletotransformtheinformationrelated with assets and the supported network, software andinfrastructure vulnerabilities, and past incidents, to the database datamodel;

– TheDBExportmoduletouploadthemodelinformationintotheSIEM.TheSIEMintegrationisflexibletoalloweasydataexchangeregardlessoftheSIEM technology inplace, supportingbothpull andpushmethodsusingstandardformats.

TheparametersoftheMRAmodelcanbefine-tunedinordertobealignedwiththecriteriafortheriskassessmentprocessesinthecompany.Thisadaptabilityisvital for the model, since a rigid approach could not fit the needs of entirelydifferent organizations, operating in various contexts and with a diverse risktolerance/appetite. The ability to customize the MRA model parameters willenable each company to address their own organizational needs, making theresultsrelevantforboththeoperationalandmanagementteams.Aftertheintegrationprocessiscomplete,therulesintheSIEMcorrelationengineshould be parameterized to establish the level of risk considered relevant toproducealertsintheSOCoperationcontext.TheoutputsoftheMRAmodelarethenponderedasanadditionalsourceofinformationthatenablesSOCoperatorstofocusonthemostcriticalassets,whilealsohavingabetterunderstandingoftherisk associated with each asset. As the risk level is an inclusive concept,understood at the different organizational levels, from technical teams to

D7.1

14

executive boardmembers, the integrationof theMRMcomponentoutputswillallow for business value perceptions to reach and influence the day-to-dayactivitiesofoperationsteams.At theoperational level, theeffectivenessof theMRMcomponent translates inaccurately assessing the security risk of company’s assets and consequentlyimproving thequalityof SIEMmonitoringanddetection.Theevaluationof theMRMcomponentefficacyrequiresitscontinuoususageduringacertainperiodoftime.Throughoutthatperiod,oneshouldcomputethenumberofincidentsthatwere detected by the SIEMusing solely the information provided by theMRMcomponent.Boththeutilityandusabilityofthecomponent,especiallywithregardto analytic capabilities, should also be evaluated by SOC analysts and SOCmanagersbymeansofquestionnaires.Atthetacticallevel,C-levelmanagerswillevaluatetheenhancementonthequalityof information concerning the security of services and software applications.QuestionnairestargetingC-levelmanagersshouldbeusedtoassessifsecurityriskawarenessisimproved.

2.2 Diversity and Forecasting Analytics Platform (WP6/WP5)

TheDiversityandForecastingAnalyticsPlatformperformsthediversityanalysisontheeventdataprovidedbyaSIEM,makesthisdataavailableforexplorativeanalysis, implements forecastingmodels andmakes thesemodels available asinteractive modelling elements. The component is developed as two tightlyrelatedsub-componentsdevelopedindifferentworkpackages,theDiversityandForecastingAnalyticsEnginedescribedin[D62]andtheDiversityandForecastingAnalyticsDashboarddescribedin[D52].

2.2.1 Diversity and Forecasting Analytics Engine (WP6)

TheDiversityandForecastingAnalyticsEngineconsumesdatafromaSIEMandusesthatdataforalertfilteringcapabilities,answeringquestionssuchas:

• HowmanyIDSplatformsalertedonthesameevent?• HowmanyeventswereflaggedbyallIDSplatforms?• HowmanyeventsweremissedbyallIDSplatforms?

Moregenerally,theenginecanspecifyhowmanyeventswereflaggedbykoutofthe n tools, for any k between 0 and the total number of tools, n. The samequestionscanbeappliedtoanytoolthatisaneventsourcefortheSIEM,includingfirewalls,antivirussoftware,IDSplatforms,etc.Inourinitialstate-of-the-artstudy,wedeterminedthatsuchanalysisispossiblein some SIEMplatforms, but not all. For example, no SIEM platform currentlyreportseventsmissedbythemonitoringtoolsandnonesuggestsbest/optimalcombination of diverse sensors. Moreover, the alert filtering featureimplementationisalsoSIEMdependent.ThediversitycountinElasticprocessestheinputdatajoiningitintooneindexcreatingavotefield,sowecancountthe“vote”directly(e.g.,7outof9).Anexamplescripttouploaddatainthisformatwillbeprovidedinthefinaldeliverablesoftheproject.

D7.1

15

The results can then be shown in a Kibana dashboard and, for XL-SIEM, theexistingfilterscanbeuseddirectly.Thealertfilteringusecaseallowsinteractivevisualisationsandstatisticsover time, takingeventdataas inputsandoutputscountssuchas1outofn,orkoutofn.ItsusevariesbetweenSIEMplatforms.Asdescribed,ascriptshowshowtojoinorfusethedataforElasticSearch,sothevotecanbefoundeasily.ForSIEMplatformsusingarelationaldatabase,suchasXL-SIEM,thesameresultswillbeachievedviatheirexistingjoinqueries.Theenginealsoenablesthelabellingofspecificeventsofinterest,includingfalsepositives – events flagged as security risks, which in fact were not – and fitsmathematicalmodelsaccordingtoeventtimestamps,aidingpredictionoftimestofutureevents.Theevent labels canbe storeddirectly in someSIEMplatforms,includingElasticandXL-SIEM,whileothersmayrequireadditionalstorage.OncetheeventscollectedbyaSIEMhavebeenlabelled,thealertfilteringisenhancedby enabling counts of true or false positives or negatives, measurement ofaccuracy, sensitivity, specificity and, finally, it can suggest the optimal diverseconfiguration.EventlabellingisnotdirectlyavailableinanySIEMweknowof.Combinedwiththealertfiltering,labelleddataprovidestheSIEMwithenhancedassessmentability.AprototypeGUItomarkspecificeventshasbeendevelopedforElastic.MoregeneralscriptstoflagupspecificrecordsbythespecificsourceIPaddress,etc.canbedevelopedandanopensourcelibraryDejaVu1canbeusedtoupdateElasticdocuments.SIEMsystemsbackedbyarelationaldatabasecanutilisetheprototypeGraphicalUserInterface(GUI)onceappropriatecreate,read,update,delete(CRUD)querieshavebeenwritten.The labellingusecaseallowsmarking of incorrectly classified events as false positive (an alert fired whennothingbadhadhappened)orfalsenegative(somethingbadhappenedthatthetoolsdidn’tspot).IttakesunlabelledeventsstoredinaSIEMandstoresthelabelsdirectlyintheSIEMorinadditionalstorage.TheoperationalstepsvarybetweenSIEMsystems,thoughtheyeitherinvolveusingafrontendorrunningscripts.The forecasting feature aims topredict the timeuntil thenext specific eventofinterest, including false negative, false positive or specific type of attack. Suchmodelsarenotwidelyavailable.R2doeshavesomeofthemodelsweprovide,butnotall. Furthermore,wehaveexpertknowledgeandprovidenovelpost-fittingproceduresanalysisofpredictivequalityandrecalibrationtoimprovepredictionaccuracy.ThemodelstakeeventsofinterestfromaSIEM,labelledwhererequired,andrunasstandaloneexecutables.TheforecastingresultsaremadeavailableasJSONandcanbeexportedinsyslogformattoo.ThisallowstheresultstobesentintoaSIEMfordisplayorforwardedtotheDiversityandForecastingAnalyticsdashboard developed inWP5 and described in Section 2.2.2. In principle, theEnginecanalsobeusedwithouttheDashboard(forexample,anindustrialpartnermaycalltheEngine’sAPIdirectlyfromtheirSIEMsystem),soitcanbeusedasaself-contained component. The display can also include details to assess the

1https://github.com/appbaseio/dejavu2https://www.r-project.org

D7.1

16

predictiveaccuracy,usingReliabilityGrowthModels(RGM).TheprocessisshowninFigure4.

Figure4–Dataflowofforecastingengine

Theforecastingusecaseenablesanadvancedwarningofpotentialproblems,butalsotheassessmenttoseeifthenumberofrelevantfailuresinthreatdetection(e.g.,failureofanIDS,ortwoIDSplatformsworkingonadiverseconfiguration)isdecreasingovertime.Themodelsuseeventtimestampsasinputandtheyoutputthepredictedtimeuntilthenextevent,accordingtovariousmodels,togetherwiththeuncertaintyassociatedwiththatprediction.

2.2.2 Diversity and Forecasting Analytics Dashboard (WP5)

Theoutputsoftheenginecanbeanalysedusingadashboard,asdescribedinthissection, forming the Diversity and Forecasting Analytics Platform, a singlecomponent that aims to provide insight into the diverse configurations of themonitoring tools and theirpredictive capabilities.Theultimategoal is tobuildsolutionsforSOCoperatorsandcybersecurityanalysts,helpingthemmakebetterinformeddecisionsduringboththedesignofadefencestrategyandduringtheevaluationofongoingattacksandanomalies.Intermsofpipeline,thedashboardelementreceivesdatafromtheSIEM,invokestheengineelement to applymodelling, visualises themodeloutput, andoffersinteractivecapabilityfortheanalyststofilterthedataandrefinethemodels.Morespecifically,threemainstagesoftheanalysisaredescribedbelow.Stage1–OverviewofalertsThealertdatacomesfromtheSIEMand,atthisstage,noinformationonwhetherthealertsarerealattacksornotisyetavailable.Duetothisfact,thenatureoftheanalysisispurelyexploratoryatthisstageandhasthegoalofgaininganoverviewofhowthealertsaredistributedovertimeandoverdifferentconfigurations,andtoidentifytrendsandoutliers.Thedashboardaimsto:

- DisplaytheoveralldistributionofalertsbrokendownbytimeandotherinformativeattributessuchasprotocolsandIPaddresses;

- Filterandfocusonaparticulartimeperiodorattributevalue.

Stage2–InteractiveexplorationofdiverseconfigurationsThedashboardinvokestheenginetogetlabelledalerts,i.e.,dataonwhetherthealertisassociatedwitharealattackornot.Thedashboardenablesanalyststo:

- Manuallyadjustwhattheperformancemetricsare;

8/31/18 2

Dashboard: Extract data

from SIEM and Generate inter-

event times

Formatdata for input to the RGM

Run

RGM

Generate Outputs from the

RGM

JSON or syslog

outputs to dashboard

or SIEM

D7.1

17

- Filterthealertstofocusonaparticularsubset(e.g.,onlyfalsepositives);- Filtertimeperiods;- Observe changes over time and/or performance during a particular

instance,suchasanattack.Stage3–AnalysisandevaluationofmodelensemblesThisstageoftheinvestigationinvolvesacombinationoftheanalysismodellingoutputs with the aim of evaluating the forecasts for future potentialvulnerabilities.Thedashboardvisualisesthemodeloutputfromtheengineandenablesanalyststo:

- Visuallyinvestigateseveralmodelswiththeirforecastsinasynopticway;- Visualisetheuncertaintyinthepredictions;- Relate the predictedmodels to past rawdata to provide context to the

predictions.

2.3 OSINT Threat Detector (WP4)

TheOSINTThreatDetector(OTD)component,asdefinedin[D41],ispartoftheOSINT data fusion and analysis group shown in Figure 2. It is designed as asubscription-basedservicewhosepurposeistoprovideorganizationswithtimelyawareness on the threat landscape regardingassets in their IT Infrastructures(ITI)of interest,as identifiedbytheorganization.EachsubscriberorganizationmaydefinemultipleITIforwhichtheOTDwillstartcollectinginformationfromTwitterandraisingalarmswithIndicatorsofCompromise(IoC)relatedtoexistingandemergingthreats.TheOTDcomponentarchitectureispresentedinFigure5.Therearefourhigh-levelsub-systemsinthecomponent:atweetcollector;atweetprocessingunit;IoCcommunicationchannels;andawebserver.

Figure5–OSINTthreatdetectorcomponentarchitecture

The tweet collector is responsible for gathering all the tweets from selectedgroups of accounts. The tweets will be filtered, classified and aggregated(clustered) in theprocessingunit, inordertoformconsistentgroupsof tweets

D7.1

18

thatwillbefurtherprocessedandconvertedintoanIoCformat.Inturn,theIoCdiscoveredwillbesenttotheSOCbyoneormorechannels:twitter;email;MISP;and SIEM. A web server will provide analysis dashboards and a managementinterface.Regardless of the channel throughwhich the analysts receive IoC providing asummaryofthethreat,thesewillcontainahyperlinktotheanalysisdashboards.Thedashboardswillprovidecompleteinformationregardingthethreatlandscapeidentifiedbythesystem:

• whichITIareaffectedand,foreachITI,whichassetsareaffected;• whichthreatsarecurrentlyactive,andwhichattractmoreattention;• accessallthetweetsrelatedtoanIoC;• analysehistoricalIoCdata.

Themanagementinterfacehastwomainfunctions:managingusercredentialsandprofiles;andmanagingtheITIdetailsandcharacteristics,whichincludescreatingordeletingboththeITIandassetsbelongingtoit.Twoprofilesareenvisagedtoaccess the component web interface: service managers identified by theorganization,andSOCoperators.Besidesaccessingtheanalysisdashboards,theservicemanagerswillbeallowedtomanagetheITI.SOCoperatorswillonlyhaveaccesstotheanalysisdashboards.BesidesintegratingwiththeSIEMplatformsandMISP,thecomponentusesemailandTwittertotargetanalystsdirectlyinordertogettheirimmediateattention.ByintegratingwithMISP,thecomponentmayconnecttothreatsharingplatformsoranyothersystemsthatuseMISP. IntegrationwiththeSIEMcanbeachievedthroughMISPordirectlyusingSIEMconnectors.Consideringthat threatsaredisclosedandconfirmedsometimeafter theystartaffectinganITI,theevaluationofthecomponentefficacyandefficiencycanonlybe conclusive after the threats are confirmed, thereforewith some delay. TheefficacyoftheOTDcomponentisrelatedtotheabilityofdiscoveringthreatsthataffecttheITIsdefined.ThisentailsfindingallthethreatsthataffectedanITIduringacertainperiodoftimeandcomputethepercentagethatwherediscoveredbytheOTD.Toevaluate theOTDefficiency in the considered timeperiod,wehave toassess the timelinesswithwhich threatswerediscovered, the relevanceof thethreats(impactontheITI),howtheywereprioritizedandtheactionabilitythatwasprovidedbythegeneratedIoC.

2.4 Listening 247 Threat Analyser (WP4)

The Listening247 Threat Analyser component aggregates information fromvariouspubliclyaccessiblesourcesontheWorldWideWeb,includingExploit-DB,NISTNVD(NISTNationalVulnerabilityDatabase),boards,news,blogs,Twitter,etc.ItalsointegratesinformationfromtheDarkWebandaggregatesitwithalltheother sources of information. All this information is aggregated according to aschema based on the NIST NVD, which avoids re-inventing the wheel whilstkeeping it familiar to the end users. The threat analyser consists of machinelearningmodels that predict threat severity (e.g., CVSS 2.0BaseMetric), threat

D7.1

19

type(e.g.,theCommonWeaknessEnumeration,orCWE,codethatclassifiesthethreatbymodusoperandi),andthetargetedplatform.Thismetadatageneratedfrom OSINT data by the threat analyser will give the SOC operators vitalinformationabouttheexposureoftheorganization'sinfrastructure,thenatureofexposure as well as the potential impact of these vulnerabilities if exploited.Together with other information, such as the timestamp, it provides vitalinformation that feeds theSOC teamwith information, allowing it toprioritizevulnerability analysis, and also quantify their degree of exposure in terms ofnumberofthreatswithhigh,medium,andlowimpactfortheinfrastructure.TheListening247ThreatAnalyserreliesontheListening247platform3forOSINTdata. However, it also gives the flexibility of integrating it with other OSINTsourcesviatheRESTAPI.TheListening247platformgathersOSINTdatabasedonakeywordsearchquerythat,inthecaseofDiSIEM,arethenamesofinfrastructuresuchassoftwareandhardware.Itprovidestheoptiontochoosetherangeofdatestogatherdatafromandalsoselectingthesourcesofinterest(suchasblogs,videos,reviews,etc.)beforeharvestingdata.TheplatformallowstheSOCoperators toenriching the harvested data using the threat analysis models. The enrichedinformationcanthenbefetcheddirectlyfromourElasticsearchdatabaseviaitsRESTAPIwhere itcanbeusedasa feed for theSIEM.Table1summarisestheintegrationoptionsavailable.

Method Description

ElasticsearchRESTAPI

DirectlyaccessthedataenrichedbythethreatanalyserfromourElasticsearchdatabases.

MISPIOCMessages

GetOSINTdataintheformofMISPmessagestoyourMISPinstancefromeverytimenewdataisavailable.

Table1–OptionsforListening247ThreatAnalyserintegration

The informationresulting fromthethreatanalysercanbeused inrelevantusecases,asdescribedinTable2.

UseCase ExpectedOutcomes

Dashboard

ThedashboardallowstheSOCoperatorstohaveanoverviewofthedimensionsofexposuretheyhave.Thisdashboardcanbeusedtodeterminewhichvulnerabilitiesshouldbeaddressedfirstbyexploringthehighestseveritythreatsforthecriticalinfrastructure.Thisdashboardcanalsobeusedtoexploretrendsintermsofthetypeofvulnerabilities,andthetargetedinfrastructuretomakedecisionaboutbudgetplanning(e.g.,procurementofmoresecurehardware).Thedashboardwillprovidethefollowingcapabilities:

3https://www.digital-mr.com/solutions/social-media-listening

D7.1

20

• InteractivelyexplorethetimeseriesofOSINTdatatoseethedistributionofHIGH,MEDIUMandLOWseveritythreatsandalsoseecountforeachofthoseclassesofthreatseverities;

• AbilitytointeractivelyexploretheCWEtop10threattypes;

• Abilitytointeractivelyexplorethetop10affectedplatformsorsystemsintheinfrastructure;

• Abilitytointeractivelyexplorespecificsourcesofinformation,suchasListening247,Exploit-DB,NIST,andDarknet;

• AbilitytoreadtheOSINTdata’sactualtext.

MISPAlerts/JSONMessages

TheMISPalertscanbeusedfordirectintegrationwiththeSIEMplatforms,enablingthereal-timealertcapabilitiesforOSINTdata.Specifically,theSIEMwillbeabletoalerttheSOCoperatorsusingexternalevents,suchasatweetaboutarelevantvulnerabilitythatisexploitedusingremoteconnections.Itcanalsocontributetotheriskscorecalculations,usingthevulnerabilityimpactclassification.

Table2–Listening247ThreatAnalyserusecases

2.5 Threat Intelligence Integrator (WP4)

ThescopeoftheThreatIntelligenceIntegratoristocorrelatebothstaticandreal-time information, such as Indicators of Compromise (IoC), associated to themonitoredinfrastructure,withdatacomingfromexternalOSINTdatafusionandanalysistools,toinferifthelattercouldrepresentrealintelligencefortheSIEM.The result of this processwill be the generation of a Threat Score, computedconsideringvariouscriteriarelatedtothreatintelligencedomain(e.g.,relevance,accuracy,timeliness,variety,completeness).ThisadditionaldatawillbeusedforenrichingtheincomingOSINTinformation,beforesendingittotheSIEM,allowingtheprioritizationofreceivedcybersecurityevents.

Figure6–ThreatIntelligenceIntegratorarchitecture

D7.1

21

Thearchitecture,depicted inFigure6, is composedof twomainmodules: (i) aMISP Instance, used to gather data from OSINT-based components and themonitoredinfrastructures.TheoutputwillbeanenrichedIoCsentdirectlytotheSIEMsystem;and(ii)theHeuristicModule,inchargeofcomputingathreatscore,throughaheuristicanalysis,enrichingtheoriginaldatacomingfromOSINTdatafusionandanalysistools,andstoredintheMISPinstance.TheintegrationbetweentheSIEMandtheThreatIntelligenceIntegratorwillrelyon MISP. The objective is to use, as much as possible, the built-in sharingcapabilitiesoftheplatformwhenthisinteractiontakesplace,suchasthezeroMQ4publish-subscribemodel.However,MISPcomeswithso-called“MISP-modules”,usedboth forad-hoc importandexportof threat information,whichrepresentanotherpossiblesolution.Moreover,newmodulescouldbecreatedfromscratchandintegratedwithitwithoutmodifyingthecorefunctionalitiesoftheplatform,ifneeded.ThedetailsforSIEMintegrationaredescribedinChapter4.The finalaimof thiscomponent is toenrichthe incomingIoC fromOSINTdatafusion and analysis toolswith a threat score,which represents the result of aheuristic analysis, performed correlating this data with static and dynamicinformation detected by themonitored infrastructure. The refined IoCwill be,then, sent to the related SIEM, allowing SOC operators to prioritize incomingevents.Thisrepresentsalsotheexpectedoutcomeofthefinalintegration.Thecomponentwillbeevaluatedconsideringvaliddatasetsfromeachmonitoredinfrastructure,usedforperformingthecorrelationwithrealIoCcomingfromtheOSINT data fusion and analysis tools. However, for making the first tests,simplifieddatasetscouldalsobeused,forevaluatingthefirstresultsgivenbytheHeuristicEngine.

2.6 Network-based Anomaly Detector (WP6)

The Network-based Anomaly Detector component monitors and assessesnetwork traffic, in real time, by using machine learning algorithms to detectanomalous behaviours in the monitored applications. It considers severalconditions, like the number of connections between different servers, theconnectionsandtrafficbetweentheportsusedbytheapplications,etc.todefinetheboundariesof legitimate traffic andmake thepredictionsaboutanomalousandpotentiallymalicioustraffic.TheoutcomeoftheintegrationwithaSIEMisthetimelydetectionofanomaliesorsignificant deviation in the behaviour of studied applications (e.g., Gitlab andOwnCloud),makingitpossibletodetectmaliciousactivitiesbeforetheygenerateasecurityincident.Thisintegrationwillbeaccomplishedthroughatextfilewiththeresultsfromthemachinelearninganalysis.SuchfileismadeavailablefortheSIEM,sothatitcanbe read and processed,making itpossible to show in the SIEMdashboard the

4http://zeromq.org/

D7.1

22

comparisonofthestudiedmachinelearningalgorithmsintermsofperformance,time,accuracy,andsuccessfulpredictions.Forevaluationpurposes,themachinelearningalgorithmsmustbetestedagainsta valid dataset, containing both legitimate and malicious traffic. During thetrainingandanalysisprocess,onlylegitimatetrafficmustbeconsidered.Whereasduringthepredictionprocess,both legitimateandanomalous traffic isused toevaluate the algorithms based on parameters such as accuracy, precision, andrecall.

Figure7–Network-basedAnomalyDetectionarchitecture

Figure7depictsthedifferentmodulesoftheNetwork-basedAnomalyDetectioncomponentand its interactionwithboth theSIEMplatformand themonitoredapplications. Table 3 lists each module and their roles. More details on thecomponentanditsmodulescanbefoundin[D61].

D7.1

23

Module Role

TrafficHeaderCaptureDatasetinstancestobeanalysedduringthetrainingandpredictionprocess.

EntryPoint

WebservicewherethetrafficcomingfromthemonitoredapplicationsisidentifiedbytheirIPaddressesandportnumbers

Configuration Allows interaction between theapplicationandtheenduser.

TrainingandPredictionUsesmachinelearningtotrainthemodelandmakepredictionsoverthecapturedata

Database(DB)Storesthetrafficcaptureandtheoutputofthetrainingandpredictionprocess.

PredictionServiceProvidestheoutputofthecomponentandinteractswithSIEMsandvisualizationtools

Table3–Network-basedAnomalyDetectionmodules

2.7 Skeptic II (WP6)

The Skeptic extension is an application-based anomaly detector that aims toenhance application security by monitoring user activities. The system uses ahybrid detection approach that combinesUBA (UserBehaviourAnalytics) andRule-basedanomalydetection.WhiletheUBAapproachreliesonmodellinguseractivitypatternsanddetectingactivitiesthatdeviatefromthelearntmodels,theRule-based approach uses rules defined by application experts to detectanomalies. Combining both approaches allows this component to improvedetectionaccuracy.SkeptichasamodularandflexibledesignthatallowsthecomponenttomonitordifferentapplicationusecasesandintegratewithdifferentSIEMplatforms.Figure8depictsthedifferentSkepticmodulesanditsinteractionwiththeSIEMplatformand the applicationmonitored, as described in [D22]. Table 4 lists the Skepticcomponentsandtheirroles.TheSkepticcomponentisanEventGeneratorbasedontheApacheSpark(version2.1.0) and theApacheHadoop (version 2.7.0) frameworks. For the integrationwithSIEMplatformsandotherDiSIEMextensions,SkepticsupportsJSON,LEEFandSYSLOGdataformats.TheSkeptichardwarerequirementsdependonthevolumeofeventsgeneratedbytheapplicationtomonitor.Thehardwareresourcestoallocatewillbemeasuredand fine-tuned during the integration phase. As an example, ongoingimplementations of Skeptic to detect malicious scraping activities on a webapplication with 5 transactions per second show that, when running in batchmode,Skeptictakesaround10minutestoanalyseallHTTPtrafficforaday.

D7.1

24

Figure8–Skepticarchitecture

Module Role

InputIngestApplicationLogsandOSINTRelevantThreatIntelligencedatafeeds

Configuration DefineUBAmodels,Rules,Scoring,Aggregation

Aggregation AggregateandnormalizeapplicationandOSINTdata

BehaviouralEngineTraindetectionUBAmodelsanddetectanomalousapplicationuseractivities

RuleEngine Load/Applydefineddetectionrules

Output AggregatedetectionresultsandwriteapplicationsessionstotheSIEM

Table4–Skepticmodules

D7.1

25

2.8 User Behaviour Analytics Platform (WP5)

The user behaviour analytics platform is an integrated system of three majorDiSIEMcontributions, theVisualUserBehaviourAnalysis (namedU4-VASABI),the User BehaviourModelling, and the TopicModellingBased User BehaviourExtractionandClustering.TheplatformreadsuseractionsorganizedinsessionsdirectlyfromtheSIEMdata,processesitandprovidestoolsforSIEMoperatorstoanalyse this behaviour in detail. The module organization and data flows arerepresentedinFigure9.

Figure9–UserBehaviourAnalyticsPlatformdataflow.

The Topic Modelling Based User Behaviour Extraction and Clustering moduleenables analysts to interactively understand user behaviour clusters based onensembles of topic modelling results. Taking advantage of the LDA (LatentDirichletAllocation)processfromtextmining,wemapthesessionsasdocumentsandeachactioninasessionasaword.Thus,wecangeneratethetopic,i.e.,theprobability clustered results of user behaviours.We provide a visual interfaceshowingthedistributionofinitialbehaviourclusters,thedistributionofactionsineachcluster,andtheoverlapamongselectedclusters.Analystscanunderstandthebehaviour and interactively select suitable clusters by observing howrepresentativetheyareandhowmuchoverlapexistamongthem.Throughsuchan iterativemanner, the analysts are able to better understand the behaviour,generate the behaviour clusters, and then further improve the user behaviourmodelling. The cluster-based behavioural model provides scores for userbehaviour based on predictions using state-of-the-art machine learningtechniques.Thesescoresindicatetheanomalyofauseractionwithrespectto(i)thegeneralbehaviourofusersinthesystem,(ii)thepastactionsofthatspecificuser,and(iii)thebehaviourofsimilarusers.ThesescoresareusedtoannotateactionswithintheU4-VASABImodule.The User Behaviour Modelling module uses long-short-term-memory (LSTM)recurrentneuralnetworkstrainedonuseractionsandthetimingoftheseactions.

D7.1

26

TheLSTMprovidesaprobabilitydistributionoverfutureactionsofthatusergivenher currentbehaviour.Contrasting theseprobabilities to theactuallyobservedactions allows one to deduce an anomaly score of the observed actions. ThisanomalyscorecanbeusedtoflaguserbehaviourdirectlyforaSIEMoperatorortouseitforanin-depthanalysisofauser’sbehaviour.Inordertoobtainmodelsforusersthatbehavesimilarly,aclusteringofusersessionisobtainedfromtheTopicModellingBasedUserBehaviourExtractionandClusteringmodule.TheU4-VASABImodule is thepartof theplatformwherethe interactivevisualanalysisof individual sessionsand theusersarevisually investigated in-depth.Thesesolutionsaredesignedtoaddressanumberofhigh-levelgoalsandanalysistasksthatarecarriedoutbySOCoperatorsandexpertswhomakedecisionsonwhether particular sessions are fraudulent or whether particular users arebehavinginanunusualmanner.TheelementsinU4-VASABIprovide(i)amulti-perspective exploration of sessions through the extraction and exploration ofsession-level metrics and the temporal distribution, (ii) a high-level semanticsummaries of action sequences through the extraction of frequent activitiesthroughpatternminingandthevisualsummariesof theminedpatterns, (iii)amulti-levelvisualisationapproachtoinvestigatemultiplesessionsrevealingtheiractivity type and temporal distributions, (iv) a comparative analysis capabilitythrough theutilisationof expectedness scores, (v) avisualuserprofile tobothsummarise the historical behavioural profile of users and to put individualsessionsincontext,and(vi)aclusterbasedsummarisationofactionstoidentifyunderlying tasks. To be able to deliver these functionalities, the U4-VASABIcomponent involves a combination of computational methods and severalinteractive views. The views are coordinatedwithin each other using a novel,multi-semantic linking approach and are provided as standalone web-basedinterfaces.All the componentswithin theUserBehaviourAnalyticsplatformaredesignedandbuiltasplatformindependent,self-contained,genericsolutionsthatcanworkwithanytime-stampedsequencesoftypedeventsthatareorganisedintosessionsand executed by identifiable agents (i.e., users or automated processes/agentswithagivenID).Allthecomponentsoperateonsuchsession-leveldataandarecapableofaccessingsuchdataeitherasstaticfilesorthroughcallstoadatastore,such as Elasticsearch or any other database. The component is also capable ofwritingadditional/computedinformationtothesedatastores.Thisflexibilitywillenablethecomponenttobeabletoworkbothindependentlyandin-coordinationwithanySIEMthatusesstandarddatastoressuchasElasticsearch.

2.9 SLiCER Cloud Archiver (WP6)

The purpose of the SLiCERCloudArchiver component is to provide long-termstorage foreventsgeneratedbysensors in themonitored infrastructure.TheseeventsaresenttoSIEM,buttheystayinthesystemforalimitedamountoftimeduetostorageconstraints(generally,lessthan6months).Thiscomponentaimsto use cloud storage services such as Amazon S35and Azure Blob6Storage to5https://aws.amazon.com/s36https://azure.microsoft.com/en-us/services/storage/blobs

D7.1

27

securelyarchivetheseeventsforalongerperiod.Increasingtheretentionperiodisimportantbecausemanythreats,vulnerabilities,andzero-daysarediscoveredlongaftertheyentertheinfrastructure.Somezero-dayvulnerabilitiestake3yearsormoretobedisclosed[AblonandBogart,2017].Anotherimportantreasonistosupportlong-termforensicanalysis,whichrequiresaccesstoeventscollectedatthetimetheincidentoccurred.Thecomponentorganizesandstorestheeventsinacloud-of-clouds[Bessanietal. 2013,Oliveiraet al.2016], ensuring security, cloud fault tolerance, andcostefficiency.Moreconcretely,thecomponentwasbuilttoworkontopoftheVAWLTmulti-cloud storage service, 7 being developed by the FCiências.ID team. ThecloudsusedforstoringtheeventsshouldbecompliantwiththeEUGeneralDataProtection Regulation (GDPR)8and any additional legislation applicable to thepersonalorsensitivedata. In theend, theobjective is toeliminatetheneedforSIEMplatformstohavealocalarchivalinfrastructure.Figure10illustrateshowtheproposedcomponentinteractswithexistingSIEMsystems.TheSensors (S)generateeventsand send them toConnectors,whichnormalizeandforwardthemtoSIEM.TheeventsmaybeuploadeddirectlyfromtheConnectorstotheSLiCER(redarrows)atthesametimethattheyarealsosenttotheSIEM.However,thisrequirestheConnectorstohavedirectInternetaccess,whichmightnotbepossibleordesiredintheinfrastructure.ThesecondoptionistohaveonlytheSIEMuploadingtheinformationtoSLiCER(bluearrow).SLiCERreceivestheevents,groupsthemintimeintervals(e.g.,anhour),andcreatesaneventblockthatiswrittentothecloudsusingtheVAWLTservice.Tominimizethecostsofusingcloudstorageservices,itisimportantthatthecomponentorganizesthe events in such a way that typical queries on the event archive avoiddownloading high volumes of data. This requirement comes from the fact thatdownload bandwidth is themost expensive resource on typical cloud storageservices. Therefore,we employ a strategy to efficiently divide event types intoblockstominimizereadsfromthecloudandmaximizetheefficiencyofclouddataretrieval.Thisisdonebygeneratingsmallindexfilesfortheblocksconsideringeventpropertiesnormallyusedforperformingsearches.

Figure10–SLiCERCloudArchiverarchitecture

7https://vawlt.io8https://ec.europa.eu/info/law/law-topic/data-protection_en

D7.1

28

ArchivalqueriesareexecutedthroughaGUIprovidedbythecomponent.Readeventblocksarecachedtobereadlocally,assuchqueriesareusuallypartofagroupofmanyoperations,mostcommonlyusedduringforensicsanalysis.MoredetailsaboutthiscomponentcanbefoundindeliverableD6.1[D61].

D7.1

29

3 SIEM Environments

This chapter describes the SIEM environments of each partner, where thedevelopedcomponentswillbeintegrated.WeaddresstheoperationalcontextofeachSIEMsinceaproductionenvironmentwithrealdataandday-to-dayusagehasradicallydifferentimplementationrequirementsthanatestenvironment.Thetechnicallimitationsofeachplatformareclearlystated,aswellasthestrategytoovercomethoselimitationsandsuccessfullyintegratethecomponents.

3.1 Micro Focus ArcSight (EDP)

TheMicroFocusArcSight,formerlyHPArcSight,istheSIEMsystemusedatEDPbyitsSOCteam.ItisthecoretechnicalsystemaroundwhichtheSOCoperationisorganisedandalsothemainsecurityeventrepositoryforEDP.TheArcSightSIEMplatformdeployedatEDPincludesthefullSIEMstack,consistingofthreemajorcomponentsofitsarchitecture:Connectors,Loggers,andtheEnterpriseSecurityManagement(ESM).TheESMcomponent,alsoknownastheconsoleorcorrelationengine,providesagraphical interface for the SOC team. This module is responsible for eventcorrelation,processingreal-timerulesandtriggeringalerts.FromthisinterfacetheSOCteammonitors,analyses,anddealswithcollecteddata,creatingcustomfilters and rules. All the real-time events are constantly being updated andpresentedindashboards,withseveralvisualizationoptions,suchascharts,tablesornetworkgraphs.The connectors are software components which collect events from sources,eitherthroughpullorpushmethods,normalizingthemintoastandardformat–CommonEventFormat(CEF).Theconnectorsarealsoresponsible for filtering,aggregatingandcategorizingtheevents,whicharethensent, inparallel, to theLoggerandESMcomponents.Thispre-processingiscrucialtosavebandwidthandimprovetheefficiencyoftheeventcollectionprocess.TheArcSightLoggerisauniversallogmanagementsolutionwiththepurposeofoptimization. The extreme high event throughput, efficiency in the long-termstorage and the rapid data analysis are themain benefits of using the Logger.Moreover,theLoggerservesasasecureeventstorageforforensicanalysisand,ifneedbe,topresenttoauthoritiesincaseoflegalprocedures.EDPmaintainstwoSIEMinstances:theproductionenvironmentonwhichtheSOCoperatorsworkandthetestenvironment,wherenewconnectorsandcomponentsareevaluatedandassessedbeforebeingdeployedtotheproductionenvironment.Bothenvironmentsarebasedonexactlythesameproductandversion,makingitpossible to have a meaningful testbed. The differences between the test andproduction environment are the number of connectors and the computationalresourcesavailable,bothofwhicharegreaterintheproductionenvironment.AllcomponentintegrationsinthecourseoftheDiSIEMprojectwillbeaddressedfirst in the test environment and, if the tests are successful, theywill then bedeployed in the production environment. The criteria to determinewhether a

D7.1

30

component is allowed to be implemented in the production environment isdividedintofunctional,technicalandproceduralvectors.Firstandforemost,thedeveloped components must be aligned with EDP security policies, legal andregulatory obligations and technical specifications. The developmentsmust bevalidatedbyEDP’ssecurityteam,includingavulnerabilityassessmentandloadtestsforqualityassurance.Lastly,theintroductionofthenewcomponentsmustnothaveanegativeimpactontheSOCoperation,followingthebasicprinciplesofDiSIEM that state minimal operational and integration overhead as the mostrelevantdriversfortheproject.

3.2 XL-SIEM (ATOS)

TheCross-LayerSecurityInformationandEventManagementtool(XL-SIEM)isanenhancedsecuritydataanalyticsplatformwithahigh-performancecorrelationengine able to raise alarms from a business perspective, considering eventscollectedatdifferentabstractionlayers.TheXL-SIEMisnotacommercialSIEMliketheMicroFocusArcSightusedbyEDP,noranopensourceplatformlikeElasticsearchusedbyAMADEUS.Instead,theXL-SIEM is a platform for research and development of new features andfunctionalities on SIEM environments. It is used to test and evaluate internaldevelopments performedwithinAtos, providing great benefit, since thewholeenvironmentiscontrolledanddevelopedinternallyattheAtoscybersecuritylab.Thesystemiscomposedbyasetofdistributedagents,responsiblefortheeventcollection,normalizationand transferprocesses; anengine, responsible for thefiltering,aggregation,andcorrelationoftheeventscollectedbytheagents,aswellas thegenerationof alarms; adatabase, responsibleof thedata storage; andadashboard,responsibleforthedatavisualizationinthewebgraphicalinterface.The XL-SIEMAgent is the resource layer component responsible for the eventcollection,normalizationandtransfer to theXL-SIEMEngine for itsprocessing.Eventsaregeneratedbydifferentsensors(e.g.,networktraffic,honeypotevents,intrusion detection systems) deployed on the customer's monitoredinfrastructure.TheXL-SIEMEngine is thecomponentontheback-end layerof themonitoringarchitectureresponsiblefortheanalysisandprocessingoftheeventscollectedbytheXL-SIEMAgents,andthegenerationofalarmsbasedonapredefinedsetofcorrelation rules or security directives. XL-SIEM is implemented in a Storm9topologyrunninginanApacheStormcluster10whichallowsittotakeadvantageofthescalabilitybenefitofthisarchitecture.TheCorrelationEngineisthecoreoftheXL-SIEMengineandintegratestheopensourcehigh-performance correlationengineEsper.11The correlatorusesEventProcessing Language 12 (EPL), which allows a flexible definition of complexcorrelationrules.ItisaSQL-likelanguagethatincludesforexamplethedetection9http://storm.apache.org/index.html10http://storm.apache.org/releases/2.0.0-SNAPSHOT/Setting-up-a-Storm-cluster.html11http://www.espertech.com/esper12https://docs.oracle.com/cd/E13157\01/wlevs/docs30/epl\guide/overview.html

D7.1

31

of patterns, the definition of datawindows or the aggregation and filtering ofincomingeventsintomorecomplexevents.TheXL-SIEMdatabaselayertakesadvantageoftheOSSIMplatformandstoragecapabilities.Moreover,itincludesthecapabilitytostorebothevents(gatheredbytheagents)andalarms(generatedbytheserver)inanexternaldatawarehouseusingaRabbitMQserver.TheformatsupportedtosendeventsandalarmsinthiscaseisJSON.The XL-SIEM web graphical interface integrates the following visualizationcapabilities:(i)differentgraphicalchartsareshowntotheuserasanoverviewofthemonitored system status; (ii) alarms, security events and raw logs can bevisualizedbytheuser;(iii)additionalinformationprovidedbyopensourcetools(e.g., Netflow traffic detected by Fprobe, vulnerabilities detected by Nessus orOpenVAS) is integrated inthegraphicalinterface;(iv) it ispossible togeneratePDFreportswithasummaryoftheSIEManalysis.AtosprovidesatestenvironmentcomposedbyaninstanceoftheXL-SIEMAgent(disiem-agent),theXL-SIEMengine,theXL-SIEMdatabaseanddashboard.ThereisnoproductionenvironmentdefinedinAtosfortheDiSIEMproject.Allresearchand development work are performed in the test environment, where newconnectorsandcomponentsareevaluatedandassessed.AtoshavesetupatestbedcomposedoftwoIntrusionDetectionSystems(IDS)–SnortandSuricata;twoapplications–OwnCloudandGitlab;oneXL-SIEMagent;andtheXL-SIEMengineaimingatcapturingthetrafficcomingfrominternalandexternalsourcesandlabellingitasnormaloranomalous,basedonthesourceIP.OwnCloud,Gitlab,andtheXL-SIEMAgenthavebeeninstalledinseparateVirtualMachines,eachonecontainingbothSnortv2.0.6andSuricatav3.2.4agents.ThetwoapplicationsandtheXL-SIEMagentarehostedinthesamenetwork,whereastheXL-SIEMengineisisolatedinadifferentnetworkprotectedbyafirewall,asshowninFigure11.

D7.1

32

Figure11–Atostestbed

The snort instance used to monitor the OwnCloud server was configured with only community.rules downloaded from the Snort site with all the rules uncommented. To monitor the GitLab server we have additionally included the registered snort rules files (as they are downloaded by default, with some of them commented). The internal subnet is composed of company users that accessOwnCloud andGitlabapplicationsregularlyandwhosebehaviourisconsiderednormalaslongastheyconnectfromwork.Anyconnectionperformedbyanoutsider(anetworkoriginoutsideoftheorganizationnetwork)isconsideredtobemalicious.Table5summarizestheplatformandsoftwarerequirementsfortheAtostestbed.Component Basesoftware Additional

requirementsDiSIEM-1OwnCloudServer

• Ubuntu• OwnCloudserverv.10.0.1• OSSECagent• Snortv.2.0.6• Suricatav.3.2.4

• Basicapplicationsimulatingloginpage

• Severaluserswithdifferentprivileges

DiSIEM-2GitLabServer

• Ubuntu• GitLabv.10.2.2• OSSECagent• Snortv.2.0.6• Suricatav.3.2.4

• Basicapplicationsimulatingloginpage

• Severaluserswithdifferentprivileges

D7.1

33

DiSIEM-FInternalfirewall+SIEMAgent

• Ubuntu• Netfilter• NIDS(Snort/Suricata)

• ConnectiontoXL-SIEMserver

XL-SIEMAgent • Ubuntu• Snortv.2.0.6• Suricatav.3.2.4RSyslog

• ConnectiontoXL-SIEMengine

• Snortcommunityrules

• SuricatadefaultrulesDiSIEM-3XL-SIEMEngine

• Debian• XL-SIEMserver• OSSECserver• OSSIM5.2.3(lastversion)• ApacheStorm1.0.2

• ConnectiontoCloud

platformforlong-termstorage

• ConnectiontoOSINTdata

AttackerMachine • Kali

• Runningoutofthetestbed

Table5–ATOSTestbedcomponentsandrequirements

3.3 Elastic Stack (AMADEUS)

ElasticStackisagroupofopensourceproductsbytheElasticcompany,designedto take data from any type of source, in any format and search, analyse, andvisualizethatdatainnear-realtime.Asmentionedin[D21],theElasticStackisnotaSIEMplatform,butoffersmostofSIEMcapabilitieswithgreaterflexibility,owinginparttoitsopensourcenature.Elasticsearch is an open-source, broadly-distributable, readily-scalable,enterprise-grade search engine. Accessible through an extensive and elaborateAPI, Elasticsearch can power extremely fast searches that support your datadiscoveryapplications.The Elastic Stack is used by Amadeus’ Application SOC team to monitor useractivitiesondifferentAmadeusapplications.TointegrateanewextensiontotheElasticStackplatform,anadministratorhasdifferentoptions,dependingonthetypeoftheextension:

• Elasticsearch REST API: The API is highly flexible and allows CRUDoperations,searches,aggregations,dataandplatformmanagement;

• ElasticStackdataingestionmodules:Logstash,BeatsEventGenerators;• Native integration with Elastic Stack components by creating custom

plugins.MostElasticStackcomponentssupportcustomplugins:Logstash,Kibana,etc.

TwoElasticStackPlatformsareusedbytheSOCteam:aproductionplatformandatestplatform.TheSOCteamatAmadeusisresponsiblefromtheimplementationandvalidationoftheDiSIEMcomponentsdeployedintheElasticStackplatform.

D7.1

34

Aswithotherdevelopments,allcomponentsmustbeinitiallydeployedinthetestenvironment and only then transferred to the production environment, if thevalidation is successful. An extension is validated and can be transferred toproduction,ifithasnoperformanceissuesandbringstheaddedvalueexpectedtoapplicationmonitoring.InadditiontotheElasticStackplatforms,AmadeushasGeneralPurposeplatformsfortestandproductionenvironments,usedtoruntoolsinteractingwiththeSIEMplatform without being natively integrated. The tables below summarize thedetailsofbothtestandproductionenvironments:

Platform Machines OS Software

General Purpose 2 VM Ubuntu 16.06, 100 GB

Disk, 16GB RAM, 4CPU Python 2.7/3 Java 8u171

Hadoop/Spark 3 VM Ubuntu 16.06, 100 GB Disk, 32GB RAM, 8 CPU

Spark 2.1.0 Hadoop 2.7.0 Java 8u171

SIEM 5 VM Ubuntu 16.04, 100 GB Disk, 16GB RAM, 4CPU Elastic Stack

Table6–Amadeustestenvironment

Platform Machines OS Software

GeneralPurpose 4VM Ubuntu16.06,150GBDisk,

16GBRAM,8CPUPython2.7/3Java8u171

Hadoop/Spark 6VM RedHatEnterpriseLinux7,2TBDisk,64GBRAM,32CPU

Spark2.1.0Hadoop2.7.0Java8u171

SIEM 10VMRedHatEnterpriseLinux7,5TBDisk,128GBRAM,

64CPUElasticStack

Table7–Amadeusproductionenvironment

3.4 Environment Validation

The component integration with the 3 available SIEM environments will beconductedusingasamplingapproach,asplanned,sincethecompleteintegrationandvalidationofallcomponentsinallSIEMplatformswouldrequireanamountofresourcesnotavailableintheproject.BymakingsurethatmostcomponentsareintegratedwithatleasttwoSIEMplatforms,andconsideringthefundamentaldifferencesbetweenallSIEMenvironments,onecaneffectivelydemonstratetheability to deploy the component in a flexiblemanner. Table 8 summarizes ourplanned integration forcomponentvalidation.Throughoutthevalidationphasesome adjustments may be made if integration issues occur or if the partnersdecidethatthecomponentmustbevalidatedinadditionalenvironments.

D7.1

35

Component TestEnvironments

Multi-levelRiskManager Amadeus,EDP

DiversityandForecastingAnalyticsPlatform Amadeus,Atos

OSINTThreatDetector Amadeus,Atos,EDP

Listening247ThreatAnalyser Amadeus,Atos,EDP

ThreatIntelligenceIntegrator Amadeus,Atos,EDP

Network-basedAnomalyDetector Atos

SkepticII Amadeus

UserBehaviourAnalyticsPlatform Amadeus

SLiCERCloudArchiver Atos,EDPTable8–Componentvalidationtestenvironments

Ontopofthecomprehensivedescriptionofcomponentinterfacesincludedinthisdeliverable,thevalidationoutputswillemphasizethemethodsusedtoguaranteetheadaptabilityofthoseinterfaces.

D7.1

36

4 Integration Plan

This chapterdefines the required interfaces (inputandoutput) for componentintegration. The components presented in Chapter 2, initially developed andimplemented in an isolated manner, will be integrated with the SIEMenvironments described in Chapter 3. Additionally, the interfaces betweencomponentswillalsobevalidated,makingsuretheoperationalcontext isvalidandseamlesslyworkingasawhole.Wewillpresent the integration interfacesas describedand represented in theDiSIEM architecture. It is relevant to note that not all components have directintegrationwiththeSIEM,asinthecaseoftheUserBehaviourAnalyticsPlatform.

4.1 Integration with SIEM

ComponentintegrationwiththeSIEMsystemisoneofthecoreactivitiesforthevalidation phase of the project. This integration can be accomplished directlybetween the developed component and the SIEM or in an indirect manner,through another component. In this section we focus on direct integrationsbetweencomponentsandtheSIEM.

4.1.1 Multi-level Risk Manager

TheMulti-levelRiskManagercomponenthasabidirectionalintegrationwiththeSIEM platforms. The DBImport module of this component will ingest asset,vulnerabilityandincidentdatafromtheSIEM.Thedesignedmethodistousethefile export capabilities of the SIEM in order to have the developed moduleaccessinga fileshareorsFTParea,processing itand importingthedatato thecomponent’sdatabase.TheresultsoftheriskassessmentcanbeexportedtotheSIEMandaDBExportsub-component was developed for that purpose, using the same fileexport/importmethod. However, wewill also implement amore efficient andseamless integrationtakingadvantageofavailableSIEMconnectors,havingtheSIEMaccessingthedatabasedirectlyandpullingtherelevant information fromthedatabasetotheSIEM,bypassingtheneedforafilerepository.

4.1.2 Diversity and Forecasting Analytics Platform

TheDiversityandForecastingAnalyticsPlatformtakeseventdataextractedfromtheSIEMdirectlyfromtheconnectors,forinstanceusingAPIcalls,performstheanalysis on its dashboard sub-component, calls its engine sub-component forforecastinganddisplaystheresultsbacktoitsdashboard.

4.1.3 Threat Intelligence Integrator

We’ll now detail, for each SIEM platform used by the DiSIEM partners, thedesigned integration plans for the OSINT Data Fusion & Analysis components,bearinginmindthattheyfollowtheDiSIEMprinciplesstatedin[D22],especiallythe one affirming that SIEM platforms should not be modified due to ourextensionsandnoadditionalorsignificantmanualworkshouldbe required tooperatethem:

D7.1

37

• XL-SIEM(ATOS):ATOSXL-SIEM,canexportand importevents through

theRabbitMQ13MessageModel.Forthispurpose,anewMISPmodulewillbe added to the MISP instance of the Threat Intelligence Integrator,allowinghimtocommunicatewiththisSIEMusingthismessagebroker.Themodulewillbealsoinchargeofhandlingtheconversionbetweenthedifferent standards used by MISP and the XL-SIEM for representingcybersecurityevents;

• ArcSight (EDP): EDP ArcSight is a SIEM widely used in industrialenvironments.Indeed,MISPcomesoutwithaspecificmoduledevelopedfortheintegrationwithArcSightitself,exchanginginformationusingtheCEF format, supported by this SIEM. Another option could be lettingArcSight interact directly with the Threat Intelligence Integrator, forpulling/pushing specific events from the MISP database, relying on theREST API that MISP itself provides. In this case, the python libraryPyMISP14couldbeusedformakingtheintegrationprocesseasier;

• Elasticsearch(AMADEUS):AMADEUSisrelyingonalocalMISPinstance,for their internal usage. The easiestway for performing the integrationwouldbetosynchronizetheinstanceoftheThreatIntelligenceIntegratorwiththisone,toexchangeinformationinanautomatedway,throughthebuilt-in information sharing capabilities of MISP. Alternatively, as forArcSight, Elasticsearch could pull/push data directly from the MISPdatabase, or through some custom import/exportmodule added to theinstanceoftheThreatIntelligenceIntegrator.

4.1.4 Diversity-enhanced Monitoring

SkepticIIisanEventGenerator,runinadifferentplatform,writingaggregatedapplicationusersessionstotheSIEM.SkepticIInativelysupportsJSONformatandisshippedwithanElasticStackconnector.CustomSIEMconnectorscanbeaddedtotheSkepticoutputmodule.

4.1.5 SLiCER Cloud Archiver

SLiCER has a passive interaction with the SIEM platform, simply receivingforwardedevents.TointegratewithSLiCER,itisnecessarytoconfiguretheSIEMto forward security events to the defined IP address and network port,whereSLiCERwillbe listening.The configurationmaybe changed in the “Slicer.conf”configurationfile,eitherdirectlyintheSLiCERfolderorthroughitsuserInterface.

4.2 Component Integration

Some of the developed components were designed to be tightly integrated,continuouslyexchanginginformationinordertoenableimprovedSOCoperations.Inthissection,wedescribehowthesecomponentensemblescanbeimplementedand validated. Figure 12 represents the DiSIEM components, accentuating thecomponentswithactiveinterfacesandnumberingthoseinterfaces.

13https://www.rabbitmq.com/14https://www.circl.lu/doc/misp/book.pdf

D7.1

38

Figure12–DiSIEMcomponentinterfaces

4.2.1 Diversity and Forecasting Analytics Platform

TheDiversityandForecastingAnalyticsDashboard retrieves thedata from theSIEMandcallstheEngineforthepredictions,usinginterface1.ThisfunctionalityisimplementedusingaRESTAPI.Themodellingandanalysisfunctionalities,thatare part of the Engine, are called from a prototype Python Flaskweb service,offering onemain endpoint, taking input data as a CSV or JSON and returningresultsasJSON.ThemodellingcodeintheEnginecanalsoberunasself-containedmodulesandcanstreamresultsoutinsyslogformat.

4.2.2 OSINT Data Fusion and Analysis Components

TheintegrationwithOSINTdatafusionandanalysistoolswillrelyonMISP.Boththe Listening247 Threat Analyser and the OSINT Threat Detector can providestructuredIoC,representedusingtheMISPJSONformat,totheThreatIntelligenceIntegrator,associatedtotheOSINTsourcestheyareusing.Theseinterfacesareidentifiedasnumber2and3,respectively.Theintegrationplanswillconsiderthefollowingguidelines:

- Listening 247 Threat Analyzer (DigitalMR): the Listening247 ThreatAnalyzer isnotusingMISP. It isnecessarytocreateauser,with limitedprivileges,intheMISPInstanceoftheThreatIntelligenceIntegrator.Inthisway, an authentication key has been generated, allowing this user to

D7.1

39

directly interacting with the MISP Database, for pulling/pulling eventsremotely,helpedbythePyMISP15library;

- OSINTThreatDetector(FCiências.ID): theOSINTThreatDetectorwillalsorelyonaprivateMISP Instance.This simplifies the sharingprocesswiththeThreatIntelligenceIntegrator,whichcanbeperformedsimplybysynchronizingthetwoMISPinstances,asexplainedintheMISPguide.16

Moreover,forimprovingtherepresentationofthesharedIoCsets,specificcustomMISPtaxonomieswillbecreatedandintegratedintheMISPinstanceoftheThreatIntelligenceIntegrator.Thiswillalsohelptheanalysismadebythelatter,duringthethreatscorecomputation.

4.2.3 Skeptic II

Figure13details thecommunicationphasesbetweenthetwocomponents.TheUserBehaviourAnalysiscomponentsingestdatafromtheSIEMextensionSkeptic.The output of these components is displayed via the VASABI visualisationextension.

Figure13SkepticIIintegrationwithUserBehaviourAnalyticsPlatform

15https://media.readthedocs.org/pdf/pymisp/master/pymisp.pdf16https://www.circl.lu/doc/misp/book.pdf

D7.1

40

5 Evaluation

Inordertoaccuratelyvalidatethedevelopedcomponentsandtheoverallsuccessof theDiSIEMproject, it isvital todefinecriteria thatallowstheconsortiumtomeasuretheeffectivenessoftheplannedactivities,beforetheyarestarted.Thissection defines some preliminary success criteria for component evaluation,alignedwiththeDiSIEMprinciplesandmethodologydescribedinSection1.1.TheevaluationitselfwillbeperformedduringtheintegrationtasksandreportedindeliverableD7.3(validationresults),onM36.AsdescribedinChapter3,thecomponentintegrationwillusetheavailableSIEMsystems, being firstly implemented in the test environments. In order to bedeployedintheproductionenvironment,thecomponentmustsatisfythesuccesscriteriadefinedinSection5.1.Thisqualityassurancemechanismisessentialtoguaranteethatthecriticalproductionenvironmentsofindustrialpartnersareinnowaynegativelyimpactedbypossiblebugsinthecomponents,aswellasmakingsurethatthecomponentshaveproventoberelevantfortheSOCoperationsandthattheydonotharmtheSIEMperformanceinanyway.

5.1 Success Criteria

The purpose of each component was detailed in Chapter 2, presenting themotivationforthedevelopmentandintegrationofthecomponentswiththeSIEMplatforms.Firstandforemost,thevalidationphasewilldetermineifthepurposeof the componentwas fulfilledandcompliedwith theexpectedoutcomes.Thisfunctionalanalysisisobviouslythemostrelevantcriterionandwillprecedeanyotherevaluation.ThesuccesscriteriaparametersarepresentedinTable9.Category Description SuccessCriteria

Functional Componentcapabilities

Thecomponentoperatesasexpectedandprovidesthenecessaryoutputs,enablingimprovedSIEMcapabilities

Functional SOCoperationimprovements

TheSIEMend-usersmustevaluatethebenefitsofintegratingthecomponentintheSIEMplatform.Performanceimprovementsshouldbemeasuredwheneverpossible

Technical ComplexityofSIEMintegration

ThecomponentmustbeintegratedconsideringtheDiSIEMprinciplesofleastimpact.TheSIEMplatformmanagersmustvalidatethattheintegrationeffortrequiredisadequate,consideringthebenefitsofusingthecomponentandtheimprovementsfortheSIEMplatform

D7.1

41

Technical Componentadaptability

OneoftheDiSIEMcornerstonesisthatalldevelopmentsmustbeconductedinordertoallowthecomponentstobeimplementedinmultipleSIEMplatforms.ThecomponentdevelopersmustmeasurethepercentageofthecomponentthatneedstoberedesignedoradaptedforadifferentSIEMplatform,todeterminetheadaptabilityofthecomponent

Functional Relevanceofthedevelopment

WhiletheteamsoperatingtheSIEMareabletogivedirectandimmediatefeedbackregardingthecomponents,themostrelevantsuccesscriterionistodetermineifthedevelopmentscontributedtoimprovetheorganizationcybersecuritycapabilities,includingforC-levelmanagementroles

Table9–Evaluationsuccesscriteria

5.2 SBD Validation Scenarios

Inthissection,wewillapplytheScenarioBaseDesignmethodologytodefinesomevalidation scenarios for one of our components: Skeptic II. Other componentsvalidationwillfollowsimilarideas.The first two phases of the SBD approach (Requirements and Design)will becovered in this report. The last phase (Evaluation)will be covered in the lastvalidationdeliverable.During theAnalysisPhase, a requirementanalysis isperformed.The idea is tounderstandtheproblemathandandidentifyareastoimprove.Variousmethodsareusedtostudythesituationandthecontext.Typically,thesemethodsincludeinterviewwithstakeholders,studiesofthecurrentsituationanditsimplementedsolutionandbrainstormingforothersolutions.Thefindingsweregatheredandanalysedinordertocomposethescenariosthatreflectimportantfeaturesoftheproblem, the typical and critical tasks of the users, the tools they use and theorganisationalcontext.FollowingtheValidationMethodologydescribedinSection1.1,weenvisiontwovalidationscenariosfortheSkepticIIextension:

• Scenario1:UserActivityDiscoveryandExploration• Scenario2:SuspiciousUserActivityDetection

Scenario1:UserActivityDiscoveryandExplorationAsecurityanalystfromtheSOCteamisperformingmanualinvestigationinordertoanalyzepotentiallysuspiciousactivitynoticedbyacustomer.Heislookingtofirstconfirmthesuspiciousactivityandtheninvestigateitsorigin.Theanalystisperformingsemi-automated investigation, consistingofextractingdifferentapplicationlogsfromvariouslocations,joiningthemandreconstructingthefull

D7.1

42

useractivityflow.Toperformthistask,theanalysthastoaccessthelogarchivenetworkshares,identifytherelevantapplicationlogarchivesandthenperformsearchesandparsingusingdifferenttools.Afterextractingtherelevanteventsfor the period provided by the customer, the analyst notices that he cannotconfirmtheuseractivityseeninthelogsissuspicious,heneedstoextractmorelogstocomparetheuseractionsseenwithhispastactions,butalsocompareitwithotheruseractivitieswithsimilarprofile.The analyst is particularly interested in the login time and location patterns,frequentactions,andactivityspikes,tochecktheuserhistory.The securityanalystneeds toprocess largeamountsof logs, so thework soonbecomestediousandtimeconsuming.HedecidestousetheElasticStacktoindexall relevant application events andexplore the results visually usingKibana.With Kibana dashboards built, the analyst quickly discovers that the userconnectiontimeandoriginchangedforthelastmonth,buthecannotconfirmthattheuseractionsareillegitimate.Hequicklyrealizesthatheneedsmoreadvancedvisualizationcapabilitiesmainlytomakesenseoftheuseractionflow.He suspects the user is performing fraudulent actions, but he needs moreinformationaboutthefunctionalflowoftheapplication.Hegetsintouchwiththeteaminchargeoftheapplicationtobetterunderstandtheactionflow.Finally,heisabletoconfirmthattheuseractivityisfraudulentandthattheuserwas exploiting a loophole in the application design. He opens a ticket and asubsequentpatchwasappliedtotheapplicationandthecustomerwasnotified.Although, the issuewas fixed for the application in question, no changesweremadetothedifferentdiscoveryand investigationtools.Theanalystknowsthatnothavingafeedbacklooptoimprovetheefficiencyoftheinvestigationtoolsfromthe results of the past investigation is sub-optimal. Table 10 summarizes theclaimsforScenario1:

Claim Pros(+)/Cons(-) Category

Claim1.InvestigationApproach

(+)analysthasknowledgeaboutcertainfeaturesandmetricsthatcanindicatefraudulentactivities(+)theprocesstodetectsuspiciousactionsissemi-automated(-)theprocessistime-consuming(-)nopossibilitytorunadetectionapproachformultipleapplicationinaconfigurablemanner

Analytics

Claim2.Visualizationscapabilities

(+)thesecurityanalystisablevisualizeapplicationeventswithKibanaforbetterunderstandingofuseractivities(-)noadvancedvisualizationstoreflectthefunctionalflowofanapplication,peer/groupsanalysis.

Visualizations

D7.1

43

(-)noKibanavisualizationsincorporatedincentralizeddashboards

Claim3.FunctionalKnowledge

(-)theanalystishighlydependentonfunctionalknowledge;thisslowsdowntheinvestigation.

Dependencies

Claim4.Userhistory

(+)theanalysthasanoverviewofuserfrequentbehavior(-)partialuserhistory

Analytics

Table10–Scenario1Claims

In theDesign Phase, the project ismoved from problem understanding to theenvisionedsolutions.Developersstartbywritingdownthescenarioinwhichtheyenvision typical or critical services that people sought from the system andintroducedconcreteideasaboutthenewfunctionality.For Scenario 1, the analystwould like to perform the same type of suspiciousactivity discovery and exploration in a more efficient way. When he receivesnotificationforanalertforaspecificapplication,hewouldliketohaveinplaceatoolthatwouldallowhimtohaveallrelevantinformationforhisinvestigationgatheredinoneplace.The analystwants focus on the actual investigation and not on log extractionparsingorbuildingvisualizations.Thetoolwouldallowtheanalysttoselecttheapplication to investigate, the locationof logsto ingest, the session features toextractandthetimerangehe is interested in.Thenthetoolswouldgathertheneeded information and present it in a centralized visualization tool, such asKibana, butwithadvanced visualization capabilities to allow the analyst tobetterunderstandtheflowevenifheisnotfamiliarwiththeapplicationitself.TheanalystwantsthetooltoallowaseamlessintegrationofOSINTdatafeedssuchasIP/Domainreputation,knownvulnerabilitiesdatabases,targetedattackcampaigns,etc.Thetoolwouldallowthedefinitionofcustomrulestoreflectexpertknowledgehelping theanalystunderstanding theuser flow.Finally, the toolwouldhaveafeedbackloopallowingtheanalysttoincorporatetheknowledgegatheredfrompreviousinvestigationsintothetools,thisenrichingitsfinaloutputresult.Scenario2:SuspiciousUserActivityDetectionInadditiontotheinvestigationanddiscoverytasksthesecurityanalystperformson a regular basis, he receives alerts from some continuous user activitymonitoringtoolsthatheneedstocheck.Mostofthemonitoringtoolsforsuspiciousactivitydetectionareeitherrulebasedorusesimplestatisticalmodelsonbasicapplicationsessionmetrics.Althoughhavingsomebasicmonitoringinplaceimprovesthechancestodetectandstopfraudulentuseractivities,theanalystknowsthatthedetectionrulesandmodelsarenotfrequentlyupdatedandcannotcovermostofthepotentialattackscenarios.

D7.1

44

Mostmonitoringtoolsusebasicapplicationlogeventsanddonotenrichthedataingested, therefore theanalyst is stillobligedtoperformmanual lookups fromvariousOSINTsourcestofurtherinspectthealertsreceived.Also,theuserhistoryisrarelytakenintoconsiderationbythemonitoringtools,thus the analyst is still obliged to manually gather more data to build anapproximate profile of theuser activity patterns. Furthermore, alerts from themonitoringtoolsareusuallypresentedinsimpledashboardsthatlackadvancedvisualisationscapabilitiesandproperreportingtools.Theanalystisunabletodisplayalltheinformationheneedsinaconciseandfriendlymanner.Table11summarizestheclaimsfromScenario2:

Claim Pros(+)/Cons(-) CategoryClaim1.Reportingtools (-)Noproperreportingtoolinplace Analytics

Claim2.Visualisationscapabilities

(+)SomevisualizationispossibleinKibana(-)thevisualizationsordashboardsdonotcontainalltheneededinformation(-)thevisualizationsanddashboardsarealsonotfrequentlyupdated(-)moreadvancedvisualizationsneeded.Thesevisualizationsshouldcontainconciseinformationabouttheusersessioninareadableformat.

Visualizations

Claim3.Rules

(+)Someautomatedrulesareinplace(-)therulesareappliedonsimplefeatures(-)nocomplexruleshavebeenderived(-)significantamountoftimetotesttheefficiencyofanewrule(-)rulesarenotupdated

Analytics

Claim4.DetectionApproach

(+)analysthasknowledgeaboutcertainfeaturesandmetricsthatcanindicatefraudulentactivities(+)theprocesstodetectsuspiciousactionsisautomated(-)theprocessistime-consuming(-)UEBAcanbeappliedtodetectnewsuspiciousactivitypatterns(-)noadvancedfeatures/metricsareusedforthedetection(-)manyapplicationstomonitorinparallel(-)needtohavethepossibilitylaunchcustomdetectionmodels/featuresontheapplication.

Analytics

Claim5.Userhistory (-)userhistoryisnotconsidered Analytics

D7.1

45

Claim6.OSINTdatafeedrelatedtoapplication

(-)needtoenrichthecurrentusersessionwithexternaldatarelatedtoapplication(e.g.OSINTfeeds)

OSINTfeed

Table11–Scenario2Claims

Ideally,thecontinuoususeractivitymonitoringtoolswouldincludebotharule-basedandabehaviouralbaseddetectionmechanism.The rule-baseddetectionmechanism should allow for flexible rulemanagement including frequent ruleupdatesandfinetuning.ThebehaviouralbasedmechanismshouldallowthedefinitionofadvancedUEBAmodelsoncustomsession featuresandshouldbe frequentlyupdatedandfine-tunedtoreflectthecurrentuseractionpatterns.Thedetectiontoolsshouldalsoallowtheanalysttohavealltheenrichedresultsaccessible from a centralized visualisation tool with advanced visualizationscapabilities,allowingtheSOCteamstohaveaccuratereportsandafine-grainedunderstandingofuseractivities.The detection tools need to allow for an easy integration with external datasourcessuchasOSINTdatafeeds,GeoIPlookupdatabases,etc.Finally,anidealuseractivitymonitoringtoolshouldbehighlyflexible,allowingtheusertospecifyallparametersspecifictoaparticularapplicationmonitored:inputloglocation,UBAmodels,sessionfeatures,UBAmodelparameters,Rules,session output location, score aggregation parameters, etc. Such scenarios andclaimswillbere-evaluatedinthenextdeliverable,andtheprogresswillbenotedandmeasured.

5.3 Ethical and Legal Aspects

SIEM platforms are used by organizations to collect, process and store largequantitiesofsecurityevents,which includemultiple typesof information fromnumeroussources.Whilemostof thecollected informationcanbeclassifiedastechnical data, the volume of private information, more specifically personalinformation,isalsorelevant.ThecomponentsdevelopedinDiSIEMwillcollectandprocessdatathatislegallyprotected,forinstancebytheGDPR.17Partoftheinformationisoriginatedintheorganization’sinternalsystems(e.g.,useractivity,Internetaccesslogs,physicalaccesslogs)butthereisalsoinformationbeingcollectedfrompublicnetworks,mainlytheInternet(e.g.,Twitterdata,threatintelligencedata).Regardlessoftheinformation origin, the developers must take into consideration privacyrequirementswhendesigningandimplementingthecomponents.

17https://ec.europa.eu/info/law/law-topic/data-protection_en

D7.1

46

Duringthecomponentintegrationandevaluationphase,ethicalandlegalissuesmayarise.Whileeachpartnerorganizationisabletoaddresslegalquestionstointernallegaldepartments,itisourobjectivetomaintainpermanentcontactwiththeEthicsadvisordesignatedfortheproject.Wewilladdressanydoubtsinthisfieldtotheadvisor,makingsurethatcomponentimplementationwillnotleadtoanydoubtfuldecisionsregardinginformationprocessinganddissemination.

D7.1

47

6 Conclusions and Future Work

In this reportwe summarized theDiSIEMpreviously defined architecture anddetailed the role of each developed component, stressing their most relevantcharacteristics, objectives and interfaces. The validation environments madeavailable by the three industry partners (Amadeus, Atos and EDP) were alsodetailed,allowingforaclearervisiononhowthecomponentdemonstratorswillbeevaluated.Thesuccesscriteriafortheintegrationandvalidationphaseswerelaidout,defining thekeyaspects thatmustbeconsideredwhenevaluating thereadinessandrelevanceofthecomponents.Webelievethatbyintegratingthecomponentsinthepartner’soperationalSIEMenvironments,DiSIEMwillprovidesignificantinsightsontheirrelevancetoSOCoperations, making the transition from theoretical models to real-worldimplementations,preciselyasdefinedintheprojectgoals.ThisdocumentwillbeusedasareferencefortheintegrationactivitiesofDiSIEMthroughouttherestoftheproject.Considering that the components will be integrated in the corporateenvironments of industrial partners, there are some inevitable limitationsregarding the dissemination of results. The SIEM platform is, by definition,responsibleforprocessingandstoringcriticalinformation,includingdetailsforsecurityincidentsandvulnerabilities.SomeofthecomponentswillexploreSIEMdataandpredictablyaccessconfidentialinformation.Therefore,whenreportingoncomponentevaluation,itisnecessarytoredactpartoftheinformation,withoutcompromisingtheabilitytoassesstheprojectresults.One of the consortium’s aspirations is to include relevant stakeholders in thecomponentimplementationandintegration,maximizingthevalidationscenariosand welcoming useful feedback to improve the components. Each industrialpartnerwilladdressthisobjectivebyinteractingwithbothinternalandexternalstakeholders,whichmay include clients,differentbusinessareasandpartners.ThemoststraightforwardexampleswouldbetoincludeSOCclients,internalorexternal, and SIEM maintenance teams, especially if they include external orsubcontracted resources. Results dissemination will follow the methodologydefinedfortheDiSIEMproject,asdescribedinWP8deliverables.

D7.1

48

References

[AblonandBogart2017]L.AblonandA.Bogart.ZeroDays,Thousandsof Nights:The Life and Times of Zero-Day Vulnerabilities and Their Exploits. RandCorporation, 2017.[Bessanietal.2013]A.Bessanietal.DepSky:dependableandsecurestorageinacloud-of-clouds.ACMTransactionsonStorage(TOS),9(4),12.2013.[Carroll 2000] J. M. Carroll. Making use: scenarios and scenario-based design.Proceedingsofthe3rdconferenceonDesigning interactivesystems:processes,practices,methods,andtechniques(DIS’00).2000.[Oliveiraetal.2016]T.Oliveiraetal.ExploringKey-ValueStoresinMulti-WriterByzantine-ResilientRegisterEmulations.Proceedingsofthe 20th InternationalConferenceonPrinciplesofDistributedSystems (OPODIS’16).2016.[D21] DiSIEM Consortium. In-depth Analysis of SIEMs Extensibility. DiSIEMProjectDeliverable2.1.February2017.[D22]DiSIEMConsortium.ReferenceArchitectureandIntegrationPlan.DiSIEMProjectDeliverable2.2.May2018.[D31]DiSIEMConsortium.SecurityMetricsandMeasurements.DiSIEMProjectDeliverable3.1.November2017.[D41] DiSIEM Consortium. Techniques and Tools for OSINT-based ThreatAnalysis.DiSIEMProjectDeliverable4.1.August2017.[D52]DiSIEMConsortium.Use casedemonstrators.DiSIEMProjectDeliverable5.2.August2018.[D61] DiSIEM Consortium. Preliminary Architecture and Service Model ofInfrastructureEnhancements.DiSIEMProjectDeliverable6.1.August2017.[D62]DiSIEMConsortium.Use casedemonstrators.DiSIEMProjectDeliverable6.2.August2018.

d7.1 validation plan - disiem-project.eudisiem-project.eu/wp-content/uploads/2018/11/d7.1.pdf ·...

Documents