d7.1 validation plan - disiem-project.eudisiem-project.eu/wp-content/uploads/2018/11/d7.1.pdf ·...
TRANSCRIPT
ProjectDeliverable
D7.1ValidationPlan
ProjectNumber 700692ProjectTitle DiSIEM–Diversity-enhancementsforSIEMsProgramme H2020-DS-04-2015Deliverabletype ReportDisseminationlevel PUSubmissiondate August31,2018(M24)Responsiblepartner EDP–EnergiasdePortugal,S.A.Editor PedroDiasRodriguesRevision 1.2
TheDiSIEMproject has received funding from theEuropeanUnion’sHorizon2020researchandinnovationprogrammeundergrantagreementNo700692.
D7.1
2
EditorPedroDiasRodrigues,EDPContributorsOlivierThonnard,AmadeusMiruna-MihaelaMironescu,AmadeusZayaniDabbabi,AmadeusGustavoGonzalez,AtosMarioFaiella,AtosCagatayTurkay,CityUniversityofLondonFrancesBuontempo,CityUniversityofLondonIlirGashi,CityUniversityofLondonPhongNgueyen,CityUniversityofLondonAbdullahiAdamu,DigitalMRLilianaRibeiro,EDPAdrianoSerckumecka,FCiências.IDAnaRespício,FCiências.IDPedroFerreira,FCiências.IDAlyssonBessani,FCiências.IDMichaelKamp,FraunhoferIAISSimingChen,FraunhoferIAISVersionHistoryVersion Author Description1.0 EDP Firstversionofthedocumentforinternalreview1.1 EDP Documentreleasedtothecoordinatorforquality
checkandsubmission1.2 FCiências.ID FinalminorimprovementsandsubmissiontoEC
D7.1
3
ExecutiveSummaryThe DiSIEM project has defined several goals with the common ambition ofextendingandimprovingexistingSIEMsystems,havingthefollowingprinciplesinmind: (1) nomodification on existing SIEM systems, (2) the exploitation ofexistingextensionports inSIEMplatforms, (3) theexistenceofnon-integratedcomponents,and(4)nosubstantialadditionalconfigurationshouldbeneeded.AreferencearchitectureforthedevelopmentswasestablishedindeliverableD2.2,whilethedescriptionandtechnicalinformationregardingeachcomponentwaslaidoutinthedeliverablesforworkpackages3-6.Thisreportcollectsanddescribesthedesignedscenariosanddevelopmentsmadeduringthefirst24monthsoftheproject,detailingtheirintegrationandvalidationplanusingthepartners’SIEMplatforms.Thecomponentswillbeintegratedwiththe SIEM systems following three different patterns – Event Generator, EventInspector,andEventCollector–thataresupportedbythethreeplatformsselectedforvalidatingtheprojectinnovations:MicroFocusArcSight(providedbyEDP),XL-SIEM(providedbyATOS)andElasticsearch(providedbyAmadeus).Adetaileddescriptionoftheplatformsisprovidedinthisreport,notingthecommonalitiesanddifferencesandhowtheyarerelevanttoensurethatDiSIEMdevelopmentsmaybeintegratedwithmultipleSIEMplatforms.The integration scope is defined in detail, using scenario-based development,consideringthetestbedsmadeavailablebytheindustrypartners.Theinformationpresented here will guide the activities for the last 12 months of the project,including the definition of success criteriawhen integrating the developmentsmadebytheconsortiumpartnersintheprovidedtestenvironments,leadingtopossibledeploymentsinpartnersproductionenvironments.
D7.1
4
TableofContents1 Introduction....................................................................................................................................71.1 ValidationMethodology..................................................................................................71.2 DiSIEMArchitecture.........................................................................................................81.3 SIEMIntegration.................................................................................................................91.4 OrganizationoftheDocument.....................................................................................9
2 DiSIEMComponents.................................................................................................................112.1 Multi-levelRiskManager(WP3)..............................................................................122.2 DiversityandForecastingAnalyticsPlatform(WP6/WP5)........................142.2.1 DiversityandForecastingAnalyticsEngine(WP6)..............................142.2.2 DiversityandForecastingAnalyticsDashboard(WP5)......................16
2.3 OSINTThreatDetector(WP4)...................................................................................172.4 Listening247ThreatAnalyser(WP4)...................................................................182.5 ThreatIntelligenceIntegrator(WP4)....................................................................202.6 Network-basedAnomalyDetector(WP6)...........................................................212.7 SkepticII(WP6)................................................................................................................232.8 UserBehaviourAnalyticsPlatform(WP5)..........................................................252.9 SLiCERCloudArchiver(WP6)...................................................................................26
3 SIEMEnvironments..................................................................................................................293.1 MicroFocusArcSight(EDP)........................................................................................293.2 XL-SIEM(ATOS)................................................................................................................303.3 ElasticStack(AMADEUS).............................................................................................333.4 EnvironmentValidation................................................................................................34
4 IntegrationPlan..........................................................................................................................364.1 IntegrationwithSIEM....................................................................................................364.1.1 Multi-levelRiskManager....................................................................................364.1.2 DiversityandForecastingAnalyticsPlatform..........................................364.1.3 ThreatIntelligenceIntegrator.........................................................................364.1.4 Diversity-enhancedMonitoring......................................................................374.1.5 SLiCERCloudArchiver........................................................................................37
4.2 ComponentIntegration.................................................................................................374.2.1 DiversityandForecastingAnalyticsPlatform..........................................384.2.2 OSINTDataFusionandAnalysisComponents........................................384.2.3 SkepticII.....................................................................................................................39
5 Evaluation......................................................................................................................................405.1 SuccessCriteria.................................................................................................................405.2 SBDValidationScenarios.............................................................................................415.3 EthicalandLegalAspects.............................................................................................45
6 ConclusionsandFutureWork.............................................................................................47References...............................................................................................................................................48
D7.1
5
ListofFiguresFigure1–DiSIEMArchitecture.......................................................................................................8Figure2–DiSIEMarchitectureandcomponents.................................................................11Figure3–Multi-levelRiskManagercomponent..................................................................13Figure4–Dataflowofforecastingengine..............................................................................16Figure5–OSINTthreatdetectorcomponentarchitecture.............................................17Figure6–ThreatIntelligenceIntegratorarchitecture......................................................20Figure7–Network-basedAnomalyDetectionarchitecture..........................................22Figure8–Skepticarchitecture......................................................................................................24Figure9–UserBehaviourAnalyticsPlatformdataflow..................................................25Figure10–SLiCERCloudArchiverarchitecture..................................................................27Figure11–Atostestbed...................................................................................................................32Figure12–DiSIEMcomponentinterfaces..............................................................................38Figure13SkepticIIintegrationwithUserBehaviourAnalyticsPlatform...............39
D7.1
6
ListofTablesTable1–OptionsforListening247ThreatAnalyserintegration.................................19Table2–Listening247ThreatAnalyserusecases..............................................................20Table3–Network-basedAnomalyDetectionmodules....................................................23Table4–Skepticmodules...............................................................................................................24Table5–ATOSTestbedcomponentsandrequirements.................................................33Table6–Amadeustestenvironment.........................................................................................34Table7–Amadeusproductionenvironment.........................................................................34Table8–Componentvalidationtestenvironments...........................................................35Table9–Evaluationsuccesscriteria.........................................................................................41Table10–Scenario1Claims..........................................................................................................43Table11–Scenario2Claims..........................................................................................................45
D7.1
7
1 Introduction
TheDiSIEMprojectaimstoimprovecurrentSecurityOperationsCentres(SOC)bydeveloping a set of extensions for SIEM (Security Information and EventManagement) systems. These components will be validated in three testenvironmentsprovidedbyEDP,Atos,andAmadeus.Afterthat,asubsetofthemwill be deployed on production environments provided by these partners forfurthervalidation.Buildinguponthedesignanddevelopmentactivitiescarriedoutduringthefirst24monthsof theDiSIEMproject, thisdeliverablegathers relevant informationregarding the implementation and validation phase. Based on the DiSIEMreferencearchitectureandconsideringtheSIEMoperationalenvironmentsfromthe partners, requirements and scenarios are defined in order to guide theintegrationofthedevelopedcomponents.We adopted a scenario-based development (SBD) methodology to drive theintegrationactivities,whichisaneffectivewaytoanalyseandcommunicateuserrequirements[Carroll2000].InthecontextofDiSIEMwewillfocusthescenariosontheSOCoperation,transformingtheday-to-dayactivitiesoftheSOCteamsinreal-lifescenariosforcomponentutilization.
1.1 Validation Methodology
By using SBD, we argue that the scenario description will be as accurate aspossibleandtheSOCteamsevaluatingthedevelopedcomponentswillrelatewiththescenarios,sincetheyarebasedontheirdailyactivitiesandneeds,addressingtheshortcomingsofexistingSIEMsystems.Ascenariomustincludethesettingorcontextdescription,actors,goals,actionsandevents.Additionally,eachscenariomustclearlyidentifytheprojectedpositiveandnegativeeffectsofitsapplication.The idea is that each scenario is easilyunderstandablenotonlyby theDiSIEMproject team,butalsobySOCusersandexternalstakeholderssuchas industrypersonnel, development teams and researchers, thus contributing to thedisseminationeffortsoftheproject.The methodology consists of three separate phases – Analysis, Design andEvaluation. In theanalysisphasetherequirementsareassessedusingmethodssuchasinterviewswithexperts,state-of-the-artstudyanddiscussiongroups.Thedesignphaseconsistsofdefiningsolutions,detailingtheirobjectives,methodstoachieve those objectives and user experience considerations. Lastly, theevaluationphaseconsiderstheusageofthecomponent,firstlyinaprototypeandlateronthe finalversion,usingthat feedback to improvethedevelopmentandeventuallyre-designingthecomponent.Anothercriticalstepisthetargetaudienceidentification.Aswementionedbefore,thedevelopedcomponentsmightbeusedbydifferent typesofend-users, fromSOCoperatorstoC-levelmanagers.Therefore,eachscenarioshouldbeevaluatedfromdifferentviewpointsinordertoachieveathoroughanalysisandavalidendproduct.
D7.1
8
In DiSIEMwe are considering 4 different points of view to assess integrationscenarios:
1. SOCoperators: end-users in theSOC teams, theprimaryusersof SIEMsystems;
2. SIEMplatformmanagers:usersresponsiblefortheSIEMplatform,whomust consider its technological evolutions and limitations, as well asassessingtheeffortrequiredtointegratethedevelopedcomponents;
3. Componentdevelopers:usersresponsibleforcomponentdevelopment,whose main concerns are the effort required in order to adapt thecomponent to different SIEM platforms. These developers can also giveprecious feedback regarding the applicability of the components,specifically if the integration outcome corresponded to the theoreticalexpectationscreatedduringthecomponentdescription;
4. C-levelmanagement:seniormanagersintheorganizationthatareabletoassess the relevance and quality of the developed component outputs,especially the ones related with risk management or other business-relevantKeyPerformanceIndicators(KPI).
We believe that taking advantage of these 4 different profiles will yieldmorerobustandusefulcomponentintegrationresults.
1.2 DiSIEM Architecture
Deliverable D2.2 [D22] included a description of the reference architecture ofDiSIEMand itsmaincomponents,whichremainedmostlyvalidandunchangedduring the component design and implementation phase (minormodificationswillbeexplained inthenextchapter).Wereviewthehigh-levelarchitecture inFigure1,builtaroundtheSIEMintegrationthatwillbevalidatedthroughoutWP7.
Figure1–DiSIEMArchitecture
DiSIEM Architecture
5/31/18 2
OSINT Data Sources
(internet)
Reporting
Correlation
Normalization
SIEM
EventStorage
Visualisationand Analysis Tools
OSINT Data Analysis and Fusion
Managed Infrastructure
Diversity-Enhanced Monitoring
Cloud-of-clouds Event Archival
Public Cloud Storage Services
another cloud
…
…
D7.1
9
It is important to recall that DiSIEM defined as a cornerstone that all theimprovements will be made around the basic SIEM functionality, withoutrequiring modifications on the base system. There are four fundamentalarchitecturalprinciplesinDiSIEM[D22].
1.3 SIEM Integration
Alldevelopedcomponentswillbeintegratedandevaluatedinoneormoreofthepartner’s SIEM platforms: Micro Focus ArcSight (EDP), XL-SIEM (ATOS) andElasticsearch (Amadeus).Asexplained indeliverableD2.1 [D21], thevarietyofSIEM platforms available and the fact that they represent some of the mostrelevant commercial and experimental SIEM platforms in the market is verysignificative.Weareconfidentthatsuccessfulvalidationoutcomesinthesefourplatformsprovesthattheinterfacesareflexibleandensuresthatthecomponentsmay be integratedwith other SIEM systems.The components are divided intothreetypes:
• Event Generator: components that generate events for the SIEM. Forinstance,newintrusiondetectorsystemsorOSINTanalysers,whichfeedtheSIEMwithnewinformation;
• Event Inspector: components that access the SIEMdatabase to inspectdata(e.g.,assetsandtheirproperties)andsubsetsofcollectedevents.Thisisespeciallyimportantforvisualisationandanalysistools;
• EventCollector:ThecomponentneedstohavefullaccesstoeventsintheSIEM. An example is any side-systems that do additional processing ofevents,beyondwhattheSIEMiscapableof.
1.4 Organization of the Document
Thischapterpresentedabriefintroductiontotheaimsofthisdocumentandsomebackgroundmaterial, preparing the reader for the rest of the report,which isorganizedas follows.Chapter2describes– inhigh level – the ten componentsdesignedand implemented intheDiSIEMproject,whileChapter3presents theSIEMenvironmentsinwhichthecomponentswillbetestedandintegratedwith.TheintegrationplanislaidoutinChapter4,includingalltherelevantinterfaces
Principle1:DiSIEMcomponentsmustbe integrated toSIEMsystemswithoutrequiringanymodificationsinthebasesystem.
Principle2:ExtensionscanbemadeeitherbyfeedingneweventstotheSIEM,bytaking info fromtheSIEMdatabase,and/orby creatingnewwaystovisualiseSIEMdata.
Principle3:Non-integratedcomponentscanbeusediftheyoperateside-by-sidewiththeSIEM,withoutreplicatinganyexistingfunctionality.
Principle4:NosignificantmanualworkshouldberequiredtosetupandoperateDiSIEMcomponents.
D7.1
10
amid components and between the components and the SIEM platforms. InChapter 5 we address all the validation activities and how effective were theplanned integrations. Concluding the document, Chapter 6 presents our finalremarksandplansforfuturework.
D7.1
11
2 DiSIEM Components
In this chapter, we present the components developed in the DiSIEM project.Figure 2 shows the updated project reference architecture with all thecomponentsandtheirrelationshipwithothercomponentsandtheSIEM.
Figure2–DiSIEMarchitectureandcomponents
Thefigureshowsninecomponents,dividedinfourgroups,inaccordancewiththetypeofenhancedprovidedbythecomponent.TheOSINTDataFusion&Analysisgroup contains the components related with the acquisition, processing, andintegrationofOSINTdata,mostlydeveloped inWP4.TheDiversity-enhancedMonitoringcomponentscomprisetheenhancedsensorsdevelopedintheproject,mostlyinWP6.AsforInfrastructureEnhancements,itisagroupwithasinglecomponent,alsodevelopedinWP6,whichimprovesthecapacityoftheSIEMtosecurelyarchiveeventsforlongtimeperiods.Thelastgroup,VisualisationandAnalysis Tools, contains the components for visual analysis and forecast ofcollectedSIEMdata.MostofthesecomponentsarecreatedinWP5,implementingalsotechniquesdevelopedinWP3.The architecture of Figure 2 updates the reference architecture described indeliverableD2.2inthreeways.First,thecomponentnameswerechangedtomakethem more mnemonic and relatable. Second, two components of theVisualisationandAnalysisToolsgroupthatcouldnotbeusedseparatelywere
SIEMReporting
CorrelationNormalization
EventStorage
Skeptic IINetwork-based Anomaly Detector
PublicCloud
Storage Services
OSINT Data Sources (Internet)
SLiCER Cloud Archiver
Listening 247Threat
Analyser
User Behaviour Analytics Platform
VisualizaNon and Analysis Tools
Diversity-enhanced Monitoring
OSINT Data Fusion & Analysis
OSINT Threat Detector
Infrastructure Enhancements
Threat Intelligence Integrator
Managed Infrastructure
raw eventsenriched eventsbulk data transfer
MulN-level Risk Manager
Diversity and Forecasting Analytics Platform
Diversity and Forecasting
Analytics Engine
Diversity and Forecasting
Analytics Dashboard
D7.1
12
merged in a newDiversity ForecastingAnalytics Platform component. Third, anew component (Multi-level Risk Management) was introduced in thearchitecture.Thiscomponentcorrespondstothematerializationofsomeresultsfromwork package 3 that could not be accurately implemented in any othercomponents.Allthecomponentsrepresentedinthisfigurewillbedescribedinthenextsections.
2.1 Multi-level Risk Manager (WP3)
TheMulti-levelRiskManager(MRM)componentispartofthevisualizationandanalysissetof tools inFigure2. It implements theMulti-levelRiskAssessment(MRA)modelproposedindeliverableD3.1[D31]andallowsonetointegratetheresults of the risk assessment with the SIEM. The risk assessment process ofDiSIEM is asset-oriented. The MRA model considers three levels of assets:services, softwareapplications, andhosts.Hosts supportsoftwareapplications,which can interact between each other, and services are composed by sets ofapplications. Themodel relies on three variables to compute the security riskscoreof eachasset: the riskassociatedwithvulnerabilities, the riskassociatedwithsecurityincidents,andtheriskinheritedfromdependenciesonotherassets.The MRM component architecture is presented in Figure 3. The central sub-componentoftheMRMisthemodel’sdatabase.Thisdatabaseispopulatedwiththeinformationon:
– thecompany’sassetsandthedependenciesbetweenthem,includingthemultiplelevelsdefinedinthemodel(services,softwareapplications,andhosts);
– the infrastructurevulnerabilities(discoveredbya third-partysoftware)andthevulnerabilitiesinthecompany’ssoftwareapplications(discoveredbypen-testingorothersoftwarevulnerabilitiesdiscoverytool);and
– thehistoryofrecentpastsecurityincidents.Topopulatethemodel’sdatabasewithallthisdata,aDBImportmodulemustbeimplemented. Similarly, the results of the risk assessment are exported to theSIEMbyaDBExportmodulethatextractstheinformationrelevantandfeedstheSIEM. The SIEM platform will then be programmed to differentiate alerts forevents involving assetswith higher risk scores to enhance itsmonitoring anddetectioncapability,thusimprovingriskmitigation.TheMRMinteractswiththedatabasetoupdatetheriskscoreoftheassets.Alltheinteraction between the users and the MRM is done through a dashboard,connectedtothedatabase,thatpresentstheriskassessmentresultsandwhereitispossibletoconfigurethemodel’sparameters.Thisdashboardwillprovidetheabilitytodrill-downintothedetailsofsecurityrisk,whilealsoenablinganalyticcapabilities to the SOC team and security managers. Decision-making will besupported by an integrated view of the infrastructure security risk, includinginformation about vulnerability severity and relevant incidents, coupled withinterdependenciesbetweenassetsforadditionalcontext.
D7.1
13
Figure3–Multi-levelRiskManagercomponent
On top of that, security risk analyticswill facilitate the generation of enrichedsecurityreportstosupportC-levelmanagers’decision-makingprocesses.To integrate theMRM componentwith a SIEM, the following sub-componentsmustbecustomised:
– TheDBImportmodule,whichshouldbeabletotransformtheinformationrelated with assets and the supported network, software andinfrastructure vulnerabilities, and past incidents, to the database datamodel;
– TheDBExportmoduletouploadthemodelinformationintotheSIEM.TheSIEMintegrationisflexibletoalloweasydataexchangeregardlessoftheSIEM technology inplace, supportingbothpull andpushmethodsusingstandardformats.
TheparametersoftheMRAmodelcanbefine-tunedinordertobealignedwiththecriteriafortheriskassessmentprocessesinthecompany.Thisadaptabilityisvital for the model, since a rigid approach could not fit the needs of entirelydifferent organizations, operating in various contexts and with a diverse risktolerance/appetite. The ability to customize the MRA model parameters willenable each company to address their own organizational needs, making theresultsrelevantforboththeoperationalandmanagementteams.Aftertheintegrationprocessiscomplete,therulesintheSIEMcorrelationengineshould be parameterized to establish the level of risk considered relevant toproducealertsintheSOCoperationcontext.TheoutputsoftheMRAmodelarethenponderedasanadditionalsourceofinformationthatenablesSOCoperatorstofocusonthemostcriticalassets,whilealsohavingabetterunderstandingoftherisk associated with each asset. As the risk level is an inclusive concept,understood at the different organizational levels, from technical teams to
D7.1
14
executive boardmembers, the integrationof theMRMcomponentoutputswillallow for business value perceptions to reach and influence the day-to-dayactivitiesofoperationsteams.At theoperational level, theeffectivenessof theMRMcomponent translates inaccurately assessing the security risk of company’s assets and consequentlyimproving thequalityof SIEMmonitoringanddetection.Theevaluationof theMRMcomponentefficacyrequiresitscontinuoususageduringacertainperiodoftime.Throughoutthatperiod,oneshouldcomputethenumberofincidentsthatwere detected by the SIEMusing solely the information provided by theMRMcomponent.Boththeutilityandusabilityofthecomponent,especiallywithregardto analytic capabilities, should also be evaluated by SOC analysts and SOCmanagersbymeansofquestionnaires.Atthetacticallevel,C-levelmanagerswillevaluatetheenhancementonthequalityof information concerning the security of services and software applications.QuestionnairestargetingC-levelmanagersshouldbeusedtoassessifsecurityriskawarenessisimproved.
2.2 Diversity and Forecasting Analytics Platform (WP6/WP5)
TheDiversityandForecastingAnalyticsPlatformperformsthediversityanalysisontheeventdataprovidedbyaSIEM,makesthisdataavailableforexplorativeanalysis, implements forecastingmodels andmakes thesemodels available asinteractive modelling elements. The component is developed as two tightlyrelatedsub-componentsdevelopedindifferentworkpackages,theDiversityandForecastingAnalyticsEnginedescribedin[D62]andtheDiversityandForecastingAnalyticsDashboarddescribedin[D52].
2.2.1 Diversity and Forecasting Analytics Engine (WP6)
TheDiversityandForecastingAnalyticsEngineconsumesdatafromaSIEMandusesthatdataforalertfilteringcapabilities,answeringquestionssuchas:
• HowmanyIDSplatformsalertedonthesameevent?• HowmanyeventswereflaggedbyallIDSplatforms?• HowmanyeventsweremissedbyallIDSplatforms?
Moregenerally,theenginecanspecifyhowmanyeventswereflaggedbykoutofthe n tools, for any k between 0 and the total number of tools, n. The samequestionscanbeappliedtoanytoolthatisaneventsourcefortheSIEM,includingfirewalls,antivirussoftware,IDSplatforms,etc.Inourinitialstate-of-the-artstudy,wedeterminedthatsuchanalysisispossiblein some SIEMplatforms, but not all. For example, no SIEM platform currentlyreportseventsmissedbythemonitoringtoolsandnonesuggestsbest/optimalcombination of diverse sensors. Moreover, the alert filtering featureimplementationisalsoSIEMdependent.ThediversitycountinElasticprocessestheinputdatajoiningitintooneindexcreatingavotefield,sowecancountthe“vote”directly(e.g.,7outof9).Anexamplescripttouploaddatainthisformatwillbeprovidedinthefinaldeliverablesoftheproject.
D7.1
15
The results can then be shown in a Kibana dashboard and, for XL-SIEM, theexistingfilterscanbeuseddirectly.Thealertfilteringusecaseallowsinteractivevisualisationsandstatisticsover time, takingeventdataas inputsandoutputscountssuchas1outofn,orkoutofn.ItsusevariesbetweenSIEMplatforms.Asdescribed,ascriptshowshowtojoinorfusethedataforElasticSearch,sothevotecanbefoundeasily.ForSIEMplatformsusingarelationaldatabase,suchasXL-SIEM,thesameresultswillbeachievedviatheirexistingjoinqueries.Theenginealsoenablesthelabellingofspecificeventsofinterest,includingfalsepositives – events flagged as security risks, which in fact were not – and fitsmathematicalmodelsaccordingtoeventtimestamps,aidingpredictionoftimestofutureevents.Theevent labels canbe storeddirectly in someSIEMplatforms,includingElasticandXL-SIEM,whileothersmayrequireadditionalstorage.OncetheeventscollectedbyaSIEMhavebeenlabelled,thealertfilteringisenhancedby enabling counts of true or false positives or negatives, measurement ofaccuracy, sensitivity, specificity and, finally, it can suggest the optimal diverseconfiguration.EventlabellingisnotdirectlyavailableinanySIEMweknowof.Combinedwiththealertfiltering,labelleddataprovidestheSIEMwithenhancedassessmentability.AprototypeGUItomarkspecificeventshasbeendevelopedforElastic.MoregeneralscriptstoflagupspecificrecordsbythespecificsourceIPaddress,etc.canbedevelopedandanopensourcelibraryDejaVu1canbeusedtoupdateElasticdocuments.SIEMsystemsbackedbyarelationaldatabasecanutilisetheprototypeGraphicalUserInterface(GUI)onceappropriatecreate,read,update,delete(CRUD)querieshavebeenwritten.The labellingusecaseallowsmarking of incorrectly classified events as false positive (an alert fired whennothingbadhadhappened)orfalsenegative(somethingbadhappenedthatthetoolsdidn’tspot).IttakesunlabelledeventsstoredinaSIEMandstoresthelabelsdirectlyintheSIEMorinadditionalstorage.TheoperationalstepsvarybetweenSIEMsystems,thoughtheyeitherinvolveusingafrontendorrunningscripts.The forecasting feature aims topredict the timeuntil thenext specific eventofinterest, including false negative, false positive or specific type of attack. Suchmodelsarenotwidelyavailable.R2doeshavesomeofthemodelsweprovide,butnotall. Furthermore,wehaveexpertknowledgeandprovidenovelpost-fittingproceduresanalysisofpredictivequalityandrecalibrationtoimprovepredictionaccuracy.ThemodelstakeeventsofinterestfromaSIEM,labelledwhererequired,andrunasstandaloneexecutables.TheforecastingresultsaremadeavailableasJSONandcanbeexportedinsyslogformattoo.ThisallowstheresultstobesentintoaSIEMfordisplayorforwardedtotheDiversityandForecastingAnalyticsdashboard developed inWP5 and described in Section 2.2.2. In principle, theEnginecanalsobeusedwithouttheDashboard(forexample,anindustrialpartnermaycalltheEngine’sAPIdirectlyfromtheirSIEMsystem),soitcanbeusedasaself-contained component. The display can also include details to assess the
1https://github.com/appbaseio/dejavu2https://www.r-project.org
D7.1
16
predictiveaccuracy,usingReliabilityGrowthModels(RGM).TheprocessisshowninFigure4.
Figure4–Dataflowofforecastingengine
Theforecastingusecaseenablesanadvancedwarningofpotentialproblems,butalsotheassessmenttoseeifthenumberofrelevantfailuresinthreatdetection(e.g.,failureofanIDS,ortwoIDSplatformsworkingonadiverseconfiguration)isdecreasingovertime.Themodelsuseeventtimestampsasinputandtheyoutputthepredictedtimeuntilthenextevent,accordingtovariousmodels,togetherwiththeuncertaintyassociatedwiththatprediction.
2.2.2 Diversity and Forecasting Analytics Dashboard (WP5)
Theoutputsoftheenginecanbeanalysedusingadashboard,asdescribedinthissection, forming the Diversity and Forecasting Analytics Platform, a singlecomponent that aims to provide insight into the diverse configurations of themonitoring tools and theirpredictive capabilities.Theultimategoal is tobuildsolutionsforSOCoperatorsandcybersecurityanalysts,helpingthemmakebetterinformeddecisionsduringboththedesignofadefencestrategyandduringtheevaluationofongoingattacksandanomalies.Intermsofpipeline,thedashboardelementreceivesdatafromtheSIEM,invokestheengineelement to applymodelling, visualises themodeloutput, andoffersinteractivecapabilityfortheanalyststofilterthedataandrefinethemodels.Morespecifically,threemainstagesoftheanalysisaredescribedbelow.Stage1–OverviewofalertsThealertdatacomesfromtheSIEMand,atthisstage,noinformationonwhetherthealertsarerealattacksornotisyetavailable.Duetothisfact,thenatureoftheanalysisispurelyexploratoryatthisstageandhasthegoalofgaininganoverviewofhowthealertsaredistributedovertimeandoverdifferentconfigurations,andtoidentifytrendsandoutliers.Thedashboardaimsto:
- DisplaytheoveralldistributionofalertsbrokendownbytimeandotherinformativeattributessuchasprotocolsandIPaddresses;
- Filterandfocusonaparticulartimeperiodorattributevalue.
Stage2–InteractiveexplorationofdiverseconfigurationsThedashboardinvokestheenginetogetlabelledalerts,i.e.,dataonwhetherthealertisassociatedwitharealattackornot.Thedashboardenablesanalyststo:
- Manuallyadjustwhattheperformancemetricsare;
8/31/18 2
Dashboard: Extract data
from SIEM and Generate inter-
event times
Formatdata for input to the RGM
Run
RGM
Generate Outputs from the
RGM
JSON or syslog
outputs to dashboard
or SIEM
D7.1
17
- Filterthealertstofocusonaparticularsubset(e.g.,onlyfalsepositives);- Filtertimeperiods;- Observe changes over time and/or performance during a particular
instance,suchasanattack.Stage3–AnalysisandevaluationofmodelensemblesThisstageoftheinvestigationinvolvesacombinationoftheanalysismodellingoutputs with the aim of evaluating the forecasts for future potentialvulnerabilities.Thedashboardvisualisesthemodeloutputfromtheengineandenablesanalyststo:
- Visuallyinvestigateseveralmodelswiththeirforecastsinasynopticway;- Visualisetheuncertaintyinthepredictions;- Relate the predictedmodels to past rawdata to provide context to the
predictions.
2.3 OSINT Threat Detector (WP4)
TheOSINTThreatDetector(OTD)component,asdefinedin[D41],ispartoftheOSINT data fusion and analysis group shown in Figure 2. It is designed as asubscription-basedservicewhosepurposeistoprovideorganizationswithtimelyawareness on the threat landscape regardingassets in their IT Infrastructures(ITI)of interest,as identifiedbytheorganization.EachsubscriberorganizationmaydefinemultipleITIforwhichtheOTDwillstartcollectinginformationfromTwitterandraisingalarmswithIndicatorsofCompromise(IoC)relatedtoexistingandemergingthreats.TheOTDcomponentarchitectureispresentedinFigure5.Therearefourhigh-levelsub-systemsinthecomponent:atweetcollector;atweetprocessingunit;IoCcommunicationchannels;andawebserver.
Figure5–OSINTthreatdetectorcomponentarchitecture
The tweet collector is responsible for gathering all the tweets from selectedgroups of accounts. The tweets will be filtered, classified and aggregated(clustered) in theprocessingunit, inordertoformconsistentgroupsof tweets
D7.1
18
thatwillbefurtherprocessedandconvertedintoanIoCformat.Inturn,theIoCdiscoveredwillbesenttotheSOCbyoneormorechannels:twitter;email;MISP;and SIEM. A web server will provide analysis dashboards and a managementinterface.Regardless of the channel throughwhich the analysts receive IoC providing asummaryofthethreat,thesewillcontainahyperlinktotheanalysisdashboards.Thedashboardswillprovidecompleteinformationregardingthethreatlandscapeidentifiedbythesystem:
• whichITIareaffectedand,foreachITI,whichassetsareaffected;• whichthreatsarecurrentlyactive,andwhichattractmoreattention;• accessallthetweetsrelatedtoanIoC;• analysehistoricalIoCdata.
Themanagementinterfacehastwomainfunctions:managingusercredentialsandprofiles;andmanagingtheITIdetailsandcharacteristics,whichincludescreatingordeletingboththeITIandassetsbelongingtoit.Twoprofilesareenvisagedtoaccess the component web interface: service managers identified by theorganization,andSOCoperators.Besidesaccessingtheanalysisdashboards,theservicemanagerswillbeallowedtomanagetheITI.SOCoperatorswillonlyhaveaccesstotheanalysisdashboards.BesidesintegratingwiththeSIEMplatformsandMISP,thecomponentusesemailandTwittertotargetanalystsdirectlyinordertogettheirimmediateattention.ByintegratingwithMISP,thecomponentmayconnecttothreatsharingplatformsoranyothersystemsthatuseMISP. IntegrationwiththeSIEMcanbeachievedthroughMISPordirectlyusingSIEMconnectors.Consideringthat threatsaredisclosedandconfirmedsometimeafter theystartaffectinganITI,theevaluationofthecomponentefficacyandefficiencycanonlybe conclusive after the threats are confirmed, thereforewith some delay. TheefficacyoftheOTDcomponentisrelatedtotheabilityofdiscoveringthreatsthataffecttheITIsdefined.ThisentailsfindingallthethreatsthataffectedanITIduringacertainperiodoftimeandcomputethepercentagethatwherediscoveredbytheOTD.Toevaluate theOTDefficiency in the considered timeperiod,wehave toassess the timelinesswithwhich threatswerediscovered, the relevanceof thethreats(impactontheITI),howtheywereprioritizedandtheactionabilitythatwasprovidedbythegeneratedIoC.
2.4 Listening 247 Threat Analyser (WP4)
The Listening247 Threat Analyser component aggregates information fromvariouspubliclyaccessiblesourcesontheWorldWideWeb,includingExploit-DB,NISTNVD(NISTNationalVulnerabilityDatabase),boards,news,blogs,Twitter,etc.ItalsointegratesinformationfromtheDarkWebandaggregatesitwithalltheother sources of information. All this information is aggregated according to aschema based on the NIST NVD, which avoids re-inventing the wheel whilstkeeping it familiar to the end users. The threat analyser consists of machinelearningmodels that predict threat severity (e.g., CVSS 2.0BaseMetric), threat
D7.1
19
type(e.g.,theCommonWeaknessEnumeration,orCWE,codethatclassifiesthethreatbymodusoperandi),andthetargetedplatform.Thismetadatageneratedfrom OSINT data by the threat analyser will give the SOC operators vitalinformationabouttheexposureoftheorganization'sinfrastructure,thenatureofexposure as well as the potential impact of these vulnerabilities if exploited.Together with other information, such as the timestamp, it provides vitalinformation that feeds theSOC teamwith information, allowing it toprioritizevulnerability analysis, and also quantify their degree of exposure in terms ofnumberofthreatswithhigh,medium,andlowimpactfortheinfrastructure.TheListening247ThreatAnalyserreliesontheListening247platform3forOSINTdata. However, it also gives the flexibility of integrating it with other OSINTsourcesviatheRESTAPI.TheListening247platformgathersOSINTdatabasedonakeywordsearchquerythat,inthecaseofDiSIEM,arethenamesofinfrastructuresuchassoftwareandhardware.Itprovidestheoptiontochoosetherangeofdatestogatherdatafromandalsoselectingthesourcesofinterest(suchasblogs,videos,reviews,etc.)beforeharvestingdata.TheplatformallowstheSOCoperators toenriching the harvested data using the threat analysis models. The enrichedinformationcanthenbefetcheddirectlyfromourElasticsearchdatabaseviaitsRESTAPIwhere itcanbeusedasa feed for theSIEM.Table1summarisestheintegrationoptionsavailable.
Method Description
ElasticsearchRESTAPI
DirectlyaccessthedataenrichedbythethreatanalyserfromourElasticsearchdatabases.
MISPIOCMessages
GetOSINTdataintheformofMISPmessagestoyourMISPinstancefromeverytimenewdataisavailable.
Table1–OptionsforListening247ThreatAnalyserintegration
The informationresulting fromthethreatanalysercanbeused inrelevantusecases,asdescribedinTable2.
UseCase ExpectedOutcomes
Dashboard
ThedashboardallowstheSOCoperatorstohaveanoverviewofthedimensionsofexposuretheyhave.Thisdashboardcanbeusedtodeterminewhichvulnerabilitiesshouldbeaddressedfirstbyexploringthehighestseveritythreatsforthecriticalinfrastructure.Thisdashboardcanalsobeusedtoexploretrendsintermsofthetypeofvulnerabilities,andthetargetedinfrastructuretomakedecisionaboutbudgetplanning(e.g.,procurementofmoresecurehardware).Thedashboardwillprovidethefollowingcapabilities:
3https://www.digital-mr.com/solutions/social-media-listening
D7.1
20
• InteractivelyexplorethetimeseriesofOSINTdatatoseethedistributionofHIGH,MEDIUMandLOWseveritythreatsandalsoseecountforeachofthoseclassesofthreatseverities;
• AbilitytointeractivelyexploretheCWEtop10threattypes;
• Abilitytointeractivelyexplorethetop10affectedplatformsorsystemsintheinfrastructure;
• Abilitytointeractivelyexplorespecificsourcesofinformation,suchasListening247,Exploit-DB,NIST,andDarknet;
• AbilitytoreadtheOSINTdata’sactualtext.
MISPAlerts/JSONMessages
TheMISPalertscanbeusedfordirectintegrationwiththeSIEMplatforms,enablingthereal-timealertcapabilitiesforOSINTdata.Specifically,theSIEMwillbeabletoalerttheSOCoperatorsusingexternalevents,suchasatweetaboutarelevantvulnerabilitythatisexploitedusingremoteconnections.Itcanalsocontributetotheriskscorecalculations,usingthevulnerabilityimpactclassification.
Table2–Listening247ThreatAnalyserusecases
2.5 Threat Intelligence Integrator (WP4)
ThescopeoftheThreatIntelligenceIntegratoristocorrelatebothstaticandreal-time information, such as Indicators of Compromise (IoC), associated to themonitoredinfrastructure,withdatacomingfromexternalOSINTdatafusionandanalysistools,toinferifthelattercouldrepresentrealintelligencefortheSIEM.The result of this processwill be the generation of a Threat Score, computedconsideringvariouscriteriarelatedtothreatintelligencedomain(e.g.,relevance,accuracy,timeliness,variety,completeness).ThisadditionaldatawillbeusedforenrichingtheincomingOSINTinformation,beforesendingittotheSIEM,allowingtheprioritizationofreceivedcybersecurityevents.
Figure6–ThreatIntelligenceIntegratorarchitecture
D7.1
21
Thearchitecture,depicted inFigure6, is composedof twomainmodules: (i) aMISP Instance, used to gather data from OSINT-based components and themonitoredinfrastructures.TheoutputwillbeanenrichedIoCsentdirectlytotheSIEMsystem;and(ii)theHeuristicModule,inchargeofcomputingathreatscore,throughaheuristicanalysis,enrichingtheoriginaldatacomingfromOSINTdatafusionandanalysistools,andstoredintheMISPinstance.TheintegrationbetweentheSIEMandtheThreatIntelligenceIntegratorwillrelyon MISP. The objective is to use, as much as possible, the built-in sharingcapabilitiesoftheplatformwhenthisinteractiontakesplace,suchasthezeroMQ4publish-subscribemodel.However,MISPcomeswithso-called“MISP-modules”,usedboth forad-hoc importandexportof threat information,whichrepresentanotherpossiblesolution.Moreover,newmodulescouldbecreatedfromscratchandintegratedwithitwithoutmodifyingthecorefunctionalitiesoftheplatform,ifneeded.ThedetailsforSIEMintegrationaredescribedinChapter4.The finalaimof thiscomponent is toenrichthe incomingIoC fromOSINTdatafusion and analysis toolswith a threat score,which represents the result of aheuristic analysis, performed correlating this data with static and dynamicinformation detected by themonitored infrastructure. The refined IoCwill be,then, sent to the related SIEM, allowing SOC operators to prioritize incomingevents.Thisrepresentsalsotheexpectedoutcomeofthefinalintegration.Thecomponentwillbeevaluatedconsideringvaliddatasetsfromeachmonitoredinfrastructure,usedforperformingthecorrelationwithrealIoCcomingfromtheOSINT data fusion and analysis tools. However, for making the first tests,simplifieddatasetscouldalsobeused,forevaluatingthefirstresultsgivenbytheHeuristicEngine.
2.6 Network-based Anomaly Detector (WP6)
The Network-based Anomaly Detector component monitors and assessesnetwork traffic, in real time, by using machine learning algorithms to detectanomalous behaviours in the monitored applications. It considers severalconditions, like the number of connections between different servers, theconnectionsandtrafficbetweentheportsusedbytheapplications,etc.todefinetheboundariesof legitimate traffic andmake thepredictionsaboutanomalousandpotentiallymalicioustraffic.TheoutcomeoftheintegrationwithaSIEMisthetimelydetectionofanomaliesorsignificant deviation in the behaviour of studied applications (e.g., Gitlab andOwnCloud),makingitpossibletodetectmaliciousactivitiesbeforetheygenerateasecurityincident.Thisintegrationwillbeaccomplishedthroughatextfilewiththeresultsfromthemachinelearninganalysis.SuchfileismadeavailablefortheSIEM,sothatitcanbe read and processed,making itpossible to show in the SIEMdashboard the
4http://zeromq.org/
D7.1
22
comparisonofthestudiedmachinelearningalgorithmsintermsofperformance,time,accuracy,andsuccessfulpredictions.Forevaluationpurposes,themachinelearningalgorithmsmustbetestedagainsta valid dataset, containing both legitimate and malicious traffic. During thetrainingandanalysisprocess,onlylegitimatetrafficmustbeconsidered.Whereasduringthepredictionprocess,both legitimateandanomalous traffic isused toevaluate the algorithms based on parameters such as accuracy, precision, andrecall.
Figure7–Network-basedAnomalyDetectionarchitecture
Figure7depictsthedifferentmodulesoftheNetwork-basedAnomalyDetectioncomponentand its interactionwithboth theSIEMplatformand themonitoredapplications. Table 3 lists each module and their roles. More details on thecomponentanditsmodulescanbefoundin[D61].
D7.1
23
Module Role
TrafficHeaderCaptureDatasetinstancestobeanalysedduringthetrainingandpredictionprocess.
EntryPoint
WebservicewherethetrafficcomingfromthemonitoredapplicationsisidentifiedbytheirIPaddressesandportnumbers
Configuration Allows interaction between theapplicationandtheenduser.
TrainingandPredictionUsesmachinelearningtotrainthemodelandmakepredictionsoverthecapturedata
Database(DB)Storesthetrafficcaptureandtheoutputofthetrainingandpredictionprocess.
PredictionServiceProvidestheoutputofthecomponentandinteractswithSIEMsandvisualizationtools
Table3–Network-basedAnomalyDetectionmodules
2.7 Skeptic II (WP6)
The Skeptic extension is an application-based anomaly detector that aims toenhance application security by monitoring user activities. The system uses ahybrid detection approach that combinesUBA (UserBehaviourAnalytics) andRule-basedanomalydetection.WhiletheUBAapproachreliesonmodellinguseractivitypatternsanddetectingactivitiesthatdeviatefromthelearntmodels,theRule-based approach uses rules defined by application experts to detectanomalies. Combining both approaches allows this component to improvedetectionaccuracy.SkeptichasamodularandflexibledesignthatallowsthecomponenttomonitordifferentapplicationusecasesandintegratewithdifferentSIEMplatforms.Figure8depictsthedifferentSkepticmodulesanditsinteractionwiththeSIEMplatformand the applicationmonitored, as described in [D22]. Table 4 lists the Skepticcomponentsandtheirroles.TheSkepticcomponentisanEventGeneratorbasedontheApacheSpark(version2.1.0) and theApacheHadoop (version 2.7.0) frameworks. For the integrationwithSIEMplatformsandotherDiSIEMextensions,SkepticsupportsJSON,LEEFandSYSLOGdataformats.TheSkeptichardwarerequirementsdependonthevolumeofeventsgeneratedbytheapplicationtomonitor.Thehardwareresourcestoallocatewillbemeasuredand fine-tuned during the integration phase. As an example, ongoingimplementations of Skeptic to detect malicious scraping activities on a webapplication with 5 transactions per second show that, when running in batchmode,Skeptictakesaround10minutestoanalyseallHTTPtrafficforaday.
D7.1
24
Figure8–Skepticarchitecture
Module Role
InputIngestApplicationLogsandOSINTRelevantThreatIntelligencedatafeeds
Configuration DefineUBAmodels,Rules,Scoring,Aggregation
Aggregation AggregateandnormalizeapplicationandOSINTdata
BehaviouralEngineTraindetectionUBAmodelsanddetectanomalousapplicationuseractivities
RuleEngine Load/Applydefineddetectionrules
Output AggregatedetectionresultsandwriteapplicationsessionstotheSIEM
Table4–Skepticmodules
D7.1
25
2.8 User Behaviour Analytics Platform (WP5)
The user behaviour analytics platform is an integrated system of three majorDiSIEMcontributions, theVisualUserBehaviourAnalysis (namedU4-VASABI),the User BehaviourModelling, and the TopicModellingBased User BehaviourExtractionandClustering.TheplatformreadsuseractionsorganizedinsessionsdirectlyfromtheSIEMdata,processesitandprovidestoolsforSIEMoperatorstoanalyse this behaviour in detail. The module organization and data flows arerepresentedinFigure9.
Figure9–UserBehaviourAnalyticsPlatformdataflow.
The Topic Modelling Based User Behaviour Extraction and Clustering moduleenables analysts to interactively understand user behaviour clusters based onensembles of topic modelling results. Taking advantage of the LDA (LatentDirichletAllocation)processfromtextmining,wemapthesessionsasdocumentsandeachactioninasessionasaword.Thus,wecangeneratethetopic,i.e.,theprobability clustered results of user behaviours.We provide a visual interfaceshowingthedistributionofinitialbehaviourclusters,thedistributionofactionsineachcluster,andtheoverlapamongselectedclusters.Analystscanunderstandthebehaviour and interactively select suitable clusters by observing howrepresentativetheyareandhowmuchoverlapexistamongthem.Throughsuchan iterativemanner, the analysts are able to better understand the behaviour,generate the behaviour clusters, and then further improve the user behaviourmodelling. The cluster-based behavioural model provides scores for userbehaviour based on predictions using state-of-the-art machine learningtechniques.Thesescoresindicatetheanomalyofauseractionwithrespectto(i)thegeneralbehaviourofusersinthesystem,(ii)thepastactionsofthatspecificuser,and(iii)thebehaviourofsimilarusers.ThesescoresareusedtoannotateactionswithintheU4-VASABImodule.The User Behaviour Modelling module uses long-short-term-memory (LSTM)recurrentneuralnetworkstrainedonuseractionsandthetimingoftheseactions.
D7.1
26
TheLSTMprovidesaprobabilitydistributionoverfutureactionsofthatusergivenher currentbehaviour.Contrasting theseprobabilities to theactuallyobservedactions allows one to deduce an anomaly score of the observed actions. ThisanomalyscorecanbeusedtoflaguserbehaviourdirectlyforaSIEMoperatorortouseitforanin-depthanalysisofauser’sbehaviour.Inordertoobtainmodelsforusersthatbehavesimilarly,aclusteringofusersessionisobtainedfromtheTopicModellingBasedUserBehaviourExtractionandClusteringmodule.TheU4-VASABImodule is thepartof theplatformwherethe interactivevisualanalysisof individual sessionsand theusersarevisually investigated in-depth.Thesesolutionsaredesignedtoaddressanumberofhigh-levelgoalsandanalysistasksthatarecarriedoutbySOCoperatorsandexpertswhomakedecisionsonwhether particular sessions are fraudulent or whether particular users arebehavinginanunusualmanner.TheelementsinU4-VASABIprovide(i)amulti-perspective exploration of sessions through the extraction and exploration ofsession-level metrics and the temporal distribution, (ii) a high-level semanticsummaries of action sequences through the extraction of frequent activitiesthroughpatternminingandthevisualsummariesof theminedpatterns, (iii)amulti-levelvisualisationapproachtoinvestigatemultiplesessionsrevealingtheiractivity type and temporal distributions, (iv) a comparative analysis capabilitythrough theutilisationof expectedness scores, (v) avisualuserprofile tobothsummarise the historical behavioural profile of users and to put individualsessionsincontext,and(vi)aclusterbasedsummarisationofactionstoidentifyunderlying tasks. To be able to deliver these functionalities, the U4-VASABIcomponent involves a combination of computational methods and severalinteractive views. The views are coordinatedwithin each other using a novel,multi-semantic linking approach and are provided as standalone web-basedinterfaces.All the componentswithin theUserBehaviourAnalyticsplatformaredesignedandbuiltasplatformindependent,self-contained,genericsolutionsthatcanworkwithanytime-stampedsequencesoftypedeventsthatareorganisedintosessionsand executed by identifiable agents (i.e., users or automated processes/agentswithagivenID).Allthecomponentsoperateonsuchsession-leveldataandarecapableofaccessingsuchdataeitherasstaticfilesorthroughcallstoadatastore,such as Elasticsearch or any other database. The component is also capable ofwritingadditional/computedinformationtothesedatastores.Thisflexibilitywillenablethecomponenttobeabletoworkbothindependentlyandin-coordinationwithanySIEMthatusesstandarddatastoressuchasElasticsearch.
2.9 SLiCER Cloud Archiver (WP6)
The purpose of the SLiCERCloudArchiver component is to provide long-termstorage foreventsgeneratedbysensors in themonitored infrastructure.TheseeventsaresenttoSIEM,buttheystayinthesystemforalimitedamountoftimeduetostorageconstraints(generally,lessthan6months).Thiscomponentaimsto use cloud storage services such as Amazon S35and Azure Blob6Storage to5https://aws.amazon.com/s36https://azure.microsoft.com/en-us/services/storage/blobs
D7.1
27
securelyarchivetheseeventsforalongerperiod.Increasingtheretentionperiodisimportantbecausemanythreats,vulnerabilities,andzero-daysarediscoveredlongaftertheyentertheinfrastructure.Somezero-dayvulnerabilitiestake3yearsormoretobedisclosed[AblonandBogart,2017].Anotherimportantreasonistosupportlong-termforensicanalysis,whichrequiresaccesstoeventscollectedatthetimetheincidentoccurred.Thecomponentorganizesandstorestheeventsinacloud-of-clouds[Bessanietal. 2013,Oliveiraet al.2016], ensuring security, cloud fault tolerance, andcostefficiency.Moreconcretely,thecomponentwasbuilttoworkontopoftheVAWLTmulti-cloud storage service, 7 being developed by the FCiências.ID team. ThecloudsusedforstoringtheeventsshouldbecompliantwiththeEUGeneralDataProtection Regulation (GDPR)8and any additional legislation applicable to thepersonalorsensitivedata. In theend, theobjective is toeliminatetheneedforSIEMplatformstohavealocalarchivalinfrastructure.Figure10illustrateshowtheproposedcomponentinteractswithexistingSIEMsystems.TheSensors (S)generateeventsand send them toConnectors,whichnormalizeandforwardthemtoSIEM.TheeventsmaybeuploadeddirectlyfromtheConnectorstotheSLiCER(redarrows)atthesametimethattheyarealsosenttotheSIEM.However,thisrequirestheConnectorstohavedirectInternetaccess,whichmightnotbepossibleordesiredintheinfrastructure.ThesecondoptionistohaveonlytheSIEMuploadingtheinformationtoSLiCER(bluearrow).SLiCERreceivestheevents,groupsthemintimeintervals(e.g.,anhour),andcreatesaneventblockthatiswrittentothecloudsusingtheVAWLTservice.Tominimizethecostsofusingcloudstorageservices,itisimportantthatthecomponentorganizesthe events in such a way that typical queries on the event archive avoiddownloading high volumes of data. This requirement comes from the fact thatdownload bandwidth is themost expensive resource on typical cloud storageservices. Therefore,we employ a strategy to efficiently divide event types intoblockstominimizereadsfromthecloudandmaximizetheefficiencyofclouddataretrieval.Thisisdonebygeneratingsmallindexfilesfortheblocksconsideringeventpropertiesnormallyusedforperformingsearches.
Figure10–SLiCERCloudArchiverarchitecture
7https://vawlt.io8https://ec.europa.eu/info/law/law-topic/data-protection_en
D7.1
28
ArchivalqueriesareexecutedthroughaGUIprovidedbythecomponent.Readeventblocksarecachedtobereadlocally,assuchqueriesareusuallypartofagroupofmanyoperations,mostcommonlyusedduringforensicsanalysis.MoredetailsaboutthiscomponentcanbefoundindeliverableD6.1[D61].
D7.1
29
3 SIEM Environments
This chapter describes the SIEM environments of each partner, where thedevelopedcomponentswillbeintegrated.WeaddresstheoperationalcontextofeachSIEMsinceaproductionenvironmentwithrealdataandday-to-dayusagehasradicallydifferentimplementationrequirementsthanatestenvironment.Thetechnicallimitationsofeachplatformareclearlystated,aswellasthestrategytoovercomethoselimitationsandsuccessfullyintegratethecomponents.
3.1 Micro Focus ArcSight (EDP)
TheMicroFocusArcSight,formerlyHPArcSight,istheSIEMsystemusedatEDPbyitsSOCteam.ItisthecoretechnicalsystemaroundwhichtheSOCoperationisorganisedandalsothemainsecurityeventrepositoryforEDP.TheArcSightSIEMplatformdeployedatEDPincludesthefullSIEMstack,consistingofthreemajorcomponentsofitsarchitecture:Connectors,Loggers,andtheEnterpriseSecurityManagement(ESM).TheESMcomponent,alsoknownastheconsoleorcorrelationengine,providesagraphical interface for the SOC team. This module is responsible for eventcorrelation,processingreal-timerulesandtriggeringalerts.FromthisinterfacetheSOCteammonitors,analyses,anddealswithcollecteddata,creatingcustomfilters and rules. All the real-time events are constantly being updated andpresentedindashboards,withseveralvisualizationoptions,suchascharts,tablesornetworkgraphs.The connectors are software components which collect events from sources,eitherthroughpullorpushmethods,normalizingthemintoastandardformat–CommonEventFormat(CEF).Theconnectorsarealsoresponsible for filtering,aggregatingandcategorizingtheevents,whicharethensent, inparallel, to theLoggerandESMcomponents.Thispre-processingiscrucialtosavebandwidthandimprovetheefficiencyoftheeventcollectionprocess.TheArcSightLoggerisauniversallogmanagementsolutionwiththepurposeofoptimization. The extreme high event throughput, efficiency in the long-termstorage and the rapid data analysis are themain benefits of using the Logger.Moreover,theLoggerservesasasecureeventstorageforforensicanalysisand,ifneedbe,topresenttoauthoritiesincaseoflegalprocedures.EDPmaintainstwoSIEMinstances:theproductionenvironmentonwhichtheSOCoperatorsworkandthetestenvironment,wherenewconnectorsandcomponentsareevaluatedandassessedbeforebeingdeployedtotheproductionenvironment.Bothenvironmentsarebasedonexactlythesameproductandversion,makingitpossible to have a meaningful testbed. The differences between the test andproduction environment are the number of connectors and the computationalresourcesavailable,bothofwhicharegreaterintheproductionenvironment.AllcomponentintegrationsinthecourseoftheDiSIEMprojectwillbeaddressedfirst in the test environment and, if the tests are successful, theywill then bedeployed in the production environment. The criteria to determinewhether a
D7.1
30
component is allowed to be implemented in the production environment isdividedintofunctional,technicalandproceduralvectors.Firstandforemost,thedeveloped components must be aligned with EDP security policies, legal andregulatory obligations and technical specifications. The developmentsmust bevalidatedbyEDP’ssecurityteam,includingavulnerabilityassessmentandloadtestsforqualityassurance.Lastly,theintroductionofthenewcomponentsmustnothaveanegativeimpactontheSOCoperation,followingthebasicprinciplesofDiSIEM that state minimal operational and integration overhead as the mostrelevantdriversfortheproject.
3.2 XL-SIEM (ATOS)
TheCross-LayerSecurityInformationandEventManagementtool(XL-SIEM)isanenhancedsecuritydataanalyticsplatformwithahigh-performancecorrelationengine able to raise alarms from a business perspective, considering eventscollectedatdifferentabstractionlayers.TheXL-SIEMisnotacommercialSIEMliketheMicroFocusArcSightusedbyEDP,noranopensourceplatformlikeElasticsearchusedbyAMADEUS.Instead,theXL-SIEM is a platform for research and development of new features andfunctionalities on SIEM environments. It is used to test and evaluate internaldevelopments performedwithinAtos, providing great benefit, since thewholeenvironmentiscontrolledanddevelopedinternallyattheAtoscybersecuritylab.Thesystemiscomposedbyasetofdistributedagents,responsiblefortheeventcollection,normalizationand transferprocesses; anengine, responsible for thefiltering,aggregation,andcorrelationoftheeventscollectedbytheagents,aswellas thegenerationof alarms; adatabase, responsibleof thedata storage; andadashboard,responsibleforthedatavisualizationinthewebgraphicalinterface.The XL-SIEMAgent is the resource layer component responsible for the eventcollection,normalizationandtransfer to theXL-SIEMEngine for itsprocessing.Eventsaregeneratedbydifferentsensors(e.g.,networktraffic,honeypotevents,intrusion detection systems) deployed on the customer's monitoredinfrastructure.TheXL-SIEMEngine is thecomponentontheback-end layerof themonitoringarchitectureresponsiblefortheanalysisandprocessingoftheeventscollectedbytheXL-SIEMAgents,andthegenerationofalarmsbasedonapredefinedsetofcorrelation rules or security directives. XL-SIEM is implemented in a Storm9topologyrunninginanApacheStormcluster10whichallowsittotakeadvantageofthescalabilitybenefitofthisarchitecture.TheCorrelationEngineisthecoreoftheXL-SIEMengineandintegratestheopensourcehigh-performance correlationengineEsper.11The correlatorusesEventProcessing Language 12 (EPL), which allows a flexible definition of complexcorrelationrules.ItisaSQL-likelanguagethatincludesforexamplethedetection9http://storm.apache.org/index.html10http://storm.apache.org/releases/2.0.0-SNAPSHOT/Setting-up-a-Storm-cluster.html11http://www.espertech.com/esper12https://docs.oracle.com/cd/E13157\01/wlevs/docs30/epl\guide/overview.html
D7.1
31
of patterns, the definition of datawindows or the aggregation and filtering ofincomingeventsintomorecomplexevents.TheXL-SIEMdatabaselayertakesadvantageoftheOSSIMplatformandstoragecapabilities.Moreover,itincludesthecapabilitytostorebothevents(gatheredbytheagents)andalarms(generatedbytheserver)inanexternaldatawarehouseusingaRabbitMQserver.TheformatsupportedtosendeventsandalarmsinthiscaseisJSON.The XL-SIEM web graphical interface integrates the following visualizationcapabilities:(i)differentgraphicalchartsareshowntotheuserasanoverviewofthemonitored system status; (ii) alarms, security events and raw logs can bevisualizedbytheuser;(iii)additionalinformationprovidedbyopensourcetools(e.g., Netflow traffic detected by Fprobe, vulnerabilities detected by Nessus orOpenVAS) is integrated inthegraphicalinterface;(iv) it ispossible togeneratePDFreportswithasummaryoftheSIEManalysis.AtosprovidesatestenvironmentcomposedbyaninstanceoftheXL-SIEMAgent(disiem-agent),theXL-SIEMengine,theXL-SIEMdatabaseanddashboard.ThereisnoproductionenvironmentdefinedinAtosfortheDiSIEMproject.Allresearchand development work are performed in the test environment, where newconnectorsandcomponentsareevaluatedandassessed.AtoshavesetupatestbedcomposedoftwoIntrusionDetectionSystems(IDS)–SnortandSuricata;twoapplications–OwnCloudandGitlab;oneXL-SIEMagent;andtheXL-SIEMengineaimingatcapturingthetrafficcomingfrominternalandexternalsourcesandlabellingitasnormaloranomalous,basedonthesourceIP.OwnCloud,Gitlab,andtheXL-SIEMAgenthavebeeninstalledinseparateVirtualMachines,eachonecontainingbothSnortv2.0.6andSuricatav3.2.4agents.ThetwoapplicationsandtheXL-SIEMagentarehostedinthesamenetwork,whereastheXL-SIEMengineisisolatedinadifferentnetworkprotectedbyafirewall,asshowninFigure11.
D7.1
32
Figure11–Atostestbed
The snort instance used to monitor the OwnCloud server was configured with only community.rules downloaded from the Snort site with all the rules uncommented. To monitor the GitLab server we have additionally included the registered snort rules files (as they are downloaded by default, with some of them commented). The internal subnet is composed of company users that accessOwnCloud andGitlabapplicationsregularlyandwhosebehaviourisconsiderednormalaslongastheyconnectfromwork.Anyconnectionperformedbyanoutsider(anetworkoriginoutsideoftheorganizationnetwork)isconsideredtobemalicious.Table5summarizestheplatformandsoftwarerequirementsfortheAtostestbed.Component Basesoftware Additional
requirementsDiSIEM-1OwnCloudServer
• Ubuntu• OwnCloudserverv.10.0.1• OSSECagent• Snortv.2.0.6• Suricatav.3.2.4
• Basicapplicationsimulatingloginpage
• Severaluserswithdifferentprivileges
DiSIEM-2GitLabServer
• Ubuntu• GitLabv.10.2.2• OSSECagent• Snortv.2.0.6• Suricatav.3.2.4
• Basicapplicationsimulatingloginpage
• Severaluserswithdifferentprivileges
D7.1
33
DiSIEM-FInternalfirewall+SIEMAgent
• Ubuntu• Netfilter• NIDS(Snort/Suricata)
• ConnectiontoXL-SIEMserver
XL-SIEMAgent • Ubuntu• Snortv.2.0.6• Suricatav.3.2.4RSyslog
• ConnectiontoXL-SIEMengine
• Snortcommunityrules
• SuricatadefaultrulesDiSIEM-3XL-SIEMEngine
• Debian• XL-SIEMserver• OSSECserver• OSSIM5.2.3(lastversion)• ApacheStorm1.0.2
• ConnectiontoCloud
platformforlong-termstorage
• ConnectiontoOSINTdata
AttackerMachine • Kali
• Runningoutofthetestbed
Table5–ATOSTestbedcomponentsandrequirements
3.3 Elastic Stack (AMADEUS)
ElasticStackisagroupofopensourceproductsbytheElasticcompany,designedto take data from any type of source, in any format and search, analyse, andvisualizethatdatainnear-realtime.Asmentionedin[D21],theElasticStackisnotaSIEMplatform,butoffersmostofSIEMcapabilitieswithgreaterflexibility,owinginparttoitsopensourcenature.Elasticsearch is an open-source, broadly-distributable, readily-scalable,enterprise-grade search engine. Accessible through an extensive and elaborateAPI, Elasticsearch can power extremely fast searches that support your datadiscoveryapplications.The Elastic Stack is used by Amadeus’ Application SOC team to monitor useractivitiesondifferentAmadeusapplications.TointegrateanewextensiontotheElasticStackplatform,anadministratorhasdifferentoptions,dependingonthetypeoftheextension:
• Elasticsearch REST API: The API is highly flexible and allows CRUDoperations,searches,aggregations,dataandplatformmanagement;
• ElasticStackdataingestionmodules:Logstash,BeatsEventGenerators;• Native integration with Elastic Stack components by creating custom
plugins.MostElasticStackcomponentssupportcustomplugins:Logstash,Kibana,etc.
TwoElasticStackPlatformsareusedbytheSOCteam:aproductionplatformandatestplatform.TheSOCteamatAmadeusisresponsiblefromtheimplementationandvalidationoftheDiSIEMcomponentsdeployedintheElasticStackplatform.
D7.1
34
Aswithotherdevelopments,allcomponentsmustbeinitiallydeployedinthetestenvironment and only then transferred to the production environment, if thevalidation is successful. An extension is validated and can be transferred toproduction,ifithasnoperformanceissuesandbringstheaddedvalueexpectedtoapplicationmonitoring.InadditiontotheElasticStackplatforms,AmadeushasGeneralPurposeplatformsfortestandproductionenvironments,usedtoruntoolsinteractingwiththeSIEMplatform without being natively integrated. The tables below summarize thedetailsofbothtestandproductionenvironments:
Platform Machines OS Software
General Purpose 2 VM Ubuntu 16.06, 100 GB
Disk, 16GB RAM, 4CPU Python 2.7/3 Java 8u171
Hadoop/Spark 3 VM Ubuntu 16.06, 100 GB Disk, 32GB RAM, 8 CPU
Spark 2.1.0 Hadoop 2.7.0 Java 8u171
SIEM 5 VM Ubuntu 16.04, 100 GB Disk, 16GB RAM, 4CPU Elastic Stack
Table6–Amadeustestenvironment
Platform Machines OS Software
GeneralPurpose 4VM Ubuntu16.06,150GBDisk,
16GBRAM,8CPUPython2.7/3Java8u171
Hadoop/Spark 6VM RedHatEnterpriseLinux7,2TBDisk,64GBRAM,32CPU
Spark2.1.0Hadoop2.7.0Java8u171
SIEM 10VMRedHatEnterpriseLinux7,5TBDisk,128GBRAM,
64CPUElasticStack
Table7–Amadeusproductionenvironment
3.4 Environment Validation
The component integration with the 3 available SIEM environments will beconductedusingasamplingapproach,asplanned,sincethecompleteintegrationandvalidationofallcomponentsinallSIEMplatformswouldrequireanamountofresourcesnotavailableintheproject.BymakingsurethatmostcomponentsareintegratedwithatleasttwoSIEMplatforms,andconsideringthefundamentaldifferencesbetweenallSIEMenvironments,onecaneffectivelydemonstratetheability to deploy the component in a flexiblemanner. Table 8 summarizes ourplanned integration forcomponentvalidation.Throughoutthevalidationphasesome adjustments may be made if integration issues occur or if the partnersdecidethatthecomponentmustbevalidatedinadditionalenvironments.
D7.1
35
Component TestEnvironments
Multi-levelRiskManager Amadeus,EDP
DiversityandForecastingAnalyticsPlatform Amadeus,Atos
OSINTThreatDetector Amadeus,Atos,EDP
Listening247ThreatAnalyser Amadeus,Atos,EDP
ThreatIntelligenceIntegrator Amadeus,Atos,EDP
Network-basedAnomalyDetector Atos
SkepticII Amadeus
UserBehaviourAnalyticsPlatform Amadeus
SLiCERCloudArchiver Atos,EDPTable8–Componentvalidationtestenvironments
Ontopofthecomprehensivedescriptionofcomponentinterfacesincludedinthisdeliverable,thevalidationoutputswillemphasizethemethodsusedtoguaranteetheadaptabilityofthoseinterfaces.
D7.1
36
4 Integration Plan
This chapterdefines the required interfaces (inputandoutput) for componentintegration. The components presented in Chapter 2, initially developed andimplemented in an isolated manner, will be integrated with the SIEMenvironments described in Chapter 3. Additionally, the interfaces betweencomponentswillalsobevalidated,makingsuretheoperationalcontext isvalidandseamlesslyworkingasawhole.Wewillpresent the integration interfacesas describedand represented in theDiSIEM architecture. It is relevant to note that not all components have directintegrationwiththeSIEM,asinthecaseoftheUserBehaviourAnalyticsPlatform.
4.1 Integration with SIEM
ComponentintegrationwiththeSIEMsystemisoneofthecoreactivitiesforthevalidation phase of the project. This integration can be accomplished directlybetween the developed component and the SIEM or in an indirect manner,through another component. In this section we focus on direct integrationsbetweencomponentsandtheSIEM.
4.1.1 Multi-level Risk Manager
TheMulti-levelRiskManagercomponenthasabidirectionalintegrationwiththeSIEM platforms. The DBImport module of this component will ingest asset,vulnerabilityandincidentdatafromtheSIEM.Thedesignedmethodistousethefile export capabilities of the SIEM in order to have the developed moduleaccessinga fileshareorsFTParea,processing itand importingthedatato thecomponent’sdatabase.TheresultsoftheriskassessmentcanbeexportedtotheSIEMandaDBExportsub-component was developed for that purpose, using the same fileexport/importmethod. However, wewill also implement amore efficient andseamless integrationtakingadvantageofavailableSIEMconnectors,havingtheSIEMaccessingthedatabasedirectlyandpullingtherelevant information fromthedatabasetotheSIEM,bypassingtheneedforafilerepository.
4.1.2 Diversity and Forecasting Analytics Platform
TheDiversityandForecastingAnalyticsPlatformtakeseventdataextractedfromtheSIEMdirectlyfromtheconnectors,forinstanceusingAPIcalls,performstheanalysis on its dashboard sub-component, calls its engine sub-component forforecastinganddisplaystheresultsbacktoitsdashboard.
4.1.3 Threat Intelligence Integrator
We’ll now detail, for each SIEM platform used by the DiSIEM partners, thedesigned integration plans for the OSINT Data Fusion & Analysis components,bearinginmindthattheyfollowtheDiSIEMprinciplesstatedin[D22],especiallythe one affirming that SIEM platforms should not be modified due to ourextensionsandnoadditionalorsignificantmanualworkshouldbe required tooperatethem:
D7.1
37
• XL-SIEM(ATOS):ATOSXL-SIEM,canexportand importevents through
theRabbitMQ13MessageModel.Forthispurpose,anewMISPmodulewillbe added to the MISP instance of the Threat Intelligence Integrator,allowinghimtocommunicatewiththisSIEMusingthismessagebroker.Themodulewillbealsoinchargeofhandlingtheconversionbetweenthedifferent standards used by MISP and the XL-SIEM for representingcybersecurityevents;
• ArcSight (EDP): EDP ArcSight is a SIEM widely used in industrialenvironments.Indeed,MISPcomesoutwithaspecificmoduledevelopedfortheintegrationwithArcSightitself,exchanginginformationusingtheCEF format, supported by this SIEM. Another option could be lettingArcSight interact directly with the Threat Intelligence Integrator, forpulling/pushing specific events from the MISP database, relying on theREST API that MISP itself provides. In this case, the python libraryPyMISP14couldbeusedformakingtheintegrationprocesseasier;
• Elasticsearch(AMADEUS):AMADEUSisrelyingonalocalMISPinstance,for their internal usage. The easiestway for performing the integrationwouldbetosynchronizetheinstanceoftheThreatIntelligenceIntegratorwiththisone,toexchangeinformationinanautomatedway,throughthebuilt-in information sharing capabilities of MISP. Alternatively, as forArcSight, Elasticsearch could pull/push data directly from the MISPdatabase, or through some custom import/exportmodule added to theinstanceoftheThreatIntelligenceIntegrator.
4.1.4 Diversity-enhanced Monitoring
SkepticIIisanEventGenerator,runinadifferentplatform,writingaggregatedapplicationusersessionstotheSIEM.SkepticIInativelysupportsJSONformatandisshippedwithanElasticStackconnector.CustomSIEMconnectorscanbeaddedtotheSkepticoutputmodule.
4.1.5 SLiCER Cloud Archiver
SLiCER has a passive interaction with the SIEM platform, simply receivingforwardedevents.TointegratewithSLiCER,itisnecessarytoconfiguretheSIEMto forward security events to the defined IP address and network port,whereSLiCERwillbe listening.The configurationmaybe changed in the “Slicer.conf”configurationfile,eitherdirectlyintheSLiCERfolderorthroughitsuserInterface.
4.2 Component Integration
Some of the developed components were designed to be tightly integrated,continuouslyexchanginginformationinordertoenableimprovedSOCoperations.Inthissection,wedescribehowthesecomponentensemblescanbeimplementedand validated. Figure 12 represents the DiSIEM components, accentuating thecomponentswithactiveinterfacesandnumberingthoseinterfaces.
13https://www.rabbitmq.com/14https://www.circl.lu/doc/misp/book.pdf
D7.1
38
Figure12–DiSIEMcomponentinterfaces
4.2.1 Diversity and Forecasting Analytics Platform
TheDiversityandForecastingAnalyticsDashboard retrieves thedata from theSIEMandcallstheEngineforthepredictions,usinginterface1.ThisfunctionalityisimplementedusingaRESTAPI.Themodellingandanalysisfunctionalities,thatare part of the Engine, are called from a prototype Python Flaskweb service,offering onemain endpoint, taking input data as a CSV or JSON and returningresultsasJSON.ThemodellingcodeintheEnginecanalsoberunasself-containedmodulesandcanstreamresultsoutinsyslogformat.
4.2.2 OSINT Data Fusion and Analysis Components
TheintegrationwithOSINTdatafusionandanalysistoolswillrelyonMISP.Boththe Listening247 Threat Analyser and the OSINT Threat Detector can providestructuredIoC,representedusingtheMISPJSONformat,totheThreatIntelligenceIntegrator,associatedtotheOSINTsourcestheyareusing.Theseinterfacesareidentifiedasnumber2and3,respectively.Theintegrationplanswillconsiderthefollowingguidelines:
- Listening 247 Threat Analyzer (DigitalMR): the Listening247 ThreatAnalyzer isnotusingMISP. It isnecessarytocreateauser,with limitedprivileges,intheMISPInstanceoftheThreatIntelligenceIntegrator.Inthisway, an authentication key has been generated, allowing this user to
D7.1
39
directly interacting with the MISP Database, for pulling/pulling eventsremotely,helpedbythePyMISP15library;
- OSINTThreatDetector(FCiências.ID): theOSINTThreatDetectorwillalsorelyonaprivateMISP Instance.This simplifies the sharingprocesswiththeThreatIntelligenceIntegrator,whichcanbeperformedsimplybysynchronizingthetwoMISPinstances,asexplainedintheMISPguide.16
Moreover,forimprovingtherepresentationofthesharedIoCsets,specificcustomMISPtaxonomieswillbecreatedandintegratedintheMISPinstanceoftheThreatIntelligenceIntegrator.Thiswillalsohelptheanalysismadebythelatter,duringthethreatscorecomputation.
4.2.3 Skeptic II
Figure13details thecommunicationphasesbetweenthetwocomponents.TheUserBehaviourAnalysiscomponentsingestdatafromtheSIEMextensionSkeptic.The output of these components is displayed via the VASABI visualisationextension.
Figure13SkepticIIintegrationwithUserBehaviourAnalyticsPlatform
15https://media.readthedocs.org/pdf/pymisp/master/pymisp.pdf16https://www.circl.lu/doc/misp/book.pdf
D7.1
40
5 Evaluation
Inordertoaccuratelyvalidatethedevelopedcomponentsandtheoverallsuccessof theDiSIEMproject, it isvital todefinecriteria thatallowstheconsortiumtomeasuretheeffectivenessoftheplannedactivities,beforetheyarestarted.Thissection defines some preliminary success criteria for component evaluation,alignedwiththeDiSIEMprinciplesandmethodologydescribedinSection1.1.TheevaluationitselfwillbeperformedduringtheintegrationtasksandreportedindeliverableD7.3(validationresults),onM36.AsdescribedinChapter3,thecomponentintegrationwillusetheavailableSIEMsystems, being firstly implemented in the test environments. In order to bedeployedintheproductionenvironment,thecomponentmustsatisfythesuccesscriteriadefinedinSection5.1.Thisqualityassurancemechanismisessentialtoguaranteethatthecriticalproductionenvironmentsofindustrialpartnersareinnowaynegativelyimpactedbypossiblebugsinthecomponents,aswellasmakingsurethatthecomponentshaveproventoberelevantfortheSOCoperationsandthattheydonotharmtheSIEMperformanceinanyway.
5.1 Success Criteria
The purpose of each component was detailed in Chapter 2, presenting themotivationforthedevelopmentandintegrationofthecomponentswiththeSIEMplatforms.Firstandforemost,thevalidationphasewilldetermineifthepurposeof the componentwas fulfilledandcompliedwith theexpectedoutcomes.Thisfunctionalanalysisisobviouslythemostrelevantcriterionandwillprecedeanyotherevaluation.ThesuccesscriteriaparametersarepresentedinTable9.Category Description SuccessCriteria
Functional Componentcapabilities
Thecomponentoperatesasexpectedandprovidesthenecessaryoutputs,enablingimprovedSIEMcapabilities
Functional SOCoperationimprovements
TheSIEMend-usersmustevaluatethebenefitsofintegratingthecomponentintheSIEMplatform.Performanceimprovementsshouldbemeasuredwheneverpossible
Technical ComplexityofSIEMintegration
ThecomponentmustbeintegratedconsideringtheDiSIEMprinciplesofleastimpact.TheSIEMplatformmanagersmustvalidatethattheintegrationeffortrequiredisadequate,consideringthebenefitsofusingthecomponentandtheimprovementsfortheSIEMplatform
D7.1
41
Technical Componentadaptability
OneoftheDiSIEMcornerstonesisthatalldevelopmentsmustbeconductedinordertoallowthecomponentstobeimplementedinmultipleSIEMplatforms.ThecomponentdevelopersmustmeasurethepercentageofthecomponentthatneedstoberedesignedoradaptedforadifferentSIEMplatform,todeterminetheadaptabilityofthecomponent
Functional Relevanceofthedevelopment
WhiletheteamsoperatingtheSIEMareabletogivedirectandimmediatefeedbackregardingthecomponents,themostrelevantsuccesscriterionistodetermineifthedevelopmentscontributedtoimprovetheorganizationcybersecuritycapabilities,includingforC-levelmanagementroles
Table9–Evaluationsuccesscriteria
5.2 SBD Validation Scenarios
Inthissection,wewillapplytheScenarioBaseDesignmethodologytodefinesomevalidation scenarios for one of our components: Skeptic II. Other componentsvalidationwillfollowsimilarideas.The first two phases of the SBD approach (Requirements and Design)will becovered in this report. The last phase (Evaluation)will be covered in the lastvalidationdeliverable.During theAnalysisPhase, a requirementanalysis isperformed.The idea is tounderstandtheproblemathandandidentifyareastoimprove.Variousmethodsareusedtostudythesituationandthecontext.Typically,thesemethodsincludeinterviewwithstakeholders,studiesofthecurrentsituationanditsimplementedsolutionandbrainstormingforothersolutions.Thefindingsweregatheredandanalysedinordertocomposethescenariosthatreflectimportantfeaturesoftheproblem, the typical and critical tasks of the users, the tools they use and theorganisationalcontext.FollowingtheValidationMethodologydescribedinSection1.1,weenvisiontwovalidationscenariosfortheSkepticIIextension:
• Scenario1:UserActivityDiscoveryandExploration• Scenario2:SuspiciousUserActivityDetection
Scenario1:UserActivityDiscoveryandExplorationAsecurityanalystfromtheSOCteamisperformingmanualinvestigationinordertoanalyzepotentiallysuspiciousactivitynoticedbyacustomer.Heislookingtofirstconfirmthesuspiciousactivityandtheninvestigateitsorigin.Theanalystisperformingsemi-automated investigation, consistingofextractingdifferentapplicationlogsfromvariouslocations,joiningthemandreconstructingthefull
D7.1
42
useractivityflow.Toperformthistask,theanalysthastoaccessthelogarchivenetworkshares,identifytherelevantapplicationlogarchivesandthenperformsearchesandparsingusingdifferenttools.Afterextractingtherelevanteventsfor the period provided by the customer, the analyst notices that he cannotconfirmtheuseractivityseeninthelogsissuspicious,heneedstoextractmorelogstocomparetheuseractionsseenwithhispastactions,butalsocompareitwithotheruseractivitieswithsimilarprofile.The analyst is particularly interested in the login time and location patterns,frequentactions,andactivityspikes,tochecktheuserhistory.The securityanalystneeds toprocess largeamountsof logs, so thework soonbecomestediousandtimeconsuming.HedecidestousetheElasticStacktoindexall relevant application events andexplore the results visually usingKibana.With Kibana dashboards built, the analyst quickly discovers that the userconnectiontimeandoriginchangedforthelastmonth,buthecannotconfirmthattheuseractionsareillegitimate.Hequicklyrealizesthatheneedsmoreadvancedvisualizationcapabilitiesmainlytomakesenseoftheuseractionflow.He suspects the user is performing fraudulent actions, but he needs moreinformationaboutthefunctionalflowoftheapplication.Hegetsintouchwiththeteaminchargeoftheapplicationtobetterunderstandtheactionflow.Finally,heisabletoconfirmthattheuseractivityisfraudulentandthattheuserwas exploiting a loophole in the application design. He opens a ticket and asubsequentpatchwasappliedtotheapplicationandthecustomerwasnotified.Although, the issuewas fixed for the application in question, no changesweremadetothedifferentdiscoveryand investigationtools.Theanalystknowsthatnothavingafeedbacklooptoimprovetheefficiencyoftheinvestigationtoolsfromthe results of the past investigation is sub-optimal. Table 10 summarizes theclaimsforScenario1:
Claim Pros(+)/Cons(-) Category
Claim1.InvestigationApproach
(+)analysthasknowledgeaboutcertainfeaturesandmetricsthatcanindicatefraudulentactivities(+)theprocesstodetectsuspiciousactionsissemi-automated(-)theprocessistime-consuming(-)nopossibilitytorunadetectionapproachformultipleapplicationinaconfigurablemanner
Analytics
Claim2.Visualizationscapabilities
(+)thesecurityanalystisablevisualizeapplicationeventswithKibanaforbetterunderstandingofuseractivities(-)noadvancedvisualizationstoreflectthefunctionalflowofanapplication,peer/groupsanalysis.
Visualizations
D7.1
43
(-)noKibanavisualizationsincorporatedincentralizeddashboards
Claim3.FunctionalKnowledge
(-)theanalystishighlydependentonfunctionalknowledge;thisslowsdowntheinvestigation.
Dependencies
Claim4.Userhistory
(+)theanalysthasanoverviewofuserfrequentbehavior(-)partialuserhistory
Analytics
Table10–Scenario1Claims
In theDesign Phase, the project ismoved from problem understanding to theenvisionedsolutions.Developersstartbywritingdownthescenarioinwhichtheyenvision typical or critical services that people sought from the system andintroducedconcreteideasaboutthenewfunctionality.For Scenario 1, the analystwould like to perform the same type of suspiciousactivity discovery and exploration in a more efficient way. When he receivesnotificationforanalertforaspecificapplication,hewouldliketohaveinplaceatoolthatwouldallowhimtohaveallrelevantinformationforhisinvestigationgatheredinoneplace.The analystwants focus on the actual investigation and not on log extractionparsingorbuildingvisualizations.Thetoolwouldallowtheanalysttoselecttheapplication to investigate, the locationof logsto ingest, the session features toextractandthetimerangehe is interested in.Thenthetoolswouldgathertheneeded information and present it in a centralized visualization tool, such asKibana, butwithadvanced visualization capabilities to allow the analyst tobetterunderstandtheflowevenifheisnotfamiliarwiththeapplicationitself.TheanalystwantsthetooltoallowaseamlessintegrationofOSINTdatafeedssuchasIP/Domainreputation,knownvulnerabilitiesdatabases,targetedattackcampaigns,etc.Thetoolwouldallowthedefinitionofcustomrulestoreflectexpertknowledgehelping theanalystunderstanding theuser flow.Finally, the toolwouldhaveafeedbackloopallowingtheanalysttoincorporatetheknowledgegatheredfrompreviousinvestigationsintothetools,thisenrichingitsfinaloutputresult.Scenario2:SuspiciousUserActivityDetectionInadditiontotheinvestigationanddiscoverytasksthesecurityanalystperformson a regular basis, he receives alerts from some continuous user activitymonitoringtoolsthatheneedstocheck.Mostofthemonitoringtoolsforsuspiciousactivitydetectionareeitherrulebasedorusesimplestatisticalmodelsonbasicapplicationsessionmetrics.Althoughhavingsomebasicmonitoringinplaceimprovesthechancestodetectandstopfraudulentuseractivities,theanalystknowsthatthedetectionrulesandmodelsarenotfrequentlyupdatedandcannotcovermostofthepotentialattackscenarios.
D7.1
44
Mostmonitoringtoolsusebasicapplicationlogeventsanddonotenrichthedataingested, therefore theanalyst is stillobligedtoperformmanual lookups fromvariousOSINTsourcestofurtherinspectthealertsreceived.Also,theuserhistoryisrarelytakenintoconsiderationbythemonitoringtools,thus the analyst is still obliged to manually gather more data to build anapproximate profile of theuser activity patterns. Furthermore, alerts from themonitoringtoolsareusuallypresentedinsimpledashboardsthatlackadvancedvisualisationscapabilitiesandproperreportingtools.Theanalystisunabletodisplayalltheinformationheneedsinaconciseandfriendlymanner.Table11summarizestheclaimsfromScenario2:
Claim Pros(+)/Cons(-) CategoryClaim1.Reportingtools (-)Noproperreportingtoolinplace Analytics
Claim2.Visualisationscapabilities
(+)SomevisualizationispossibleinKibana(-)thevisualizationsordashboardsdonotcontainalltheneededinformation(-)thevisualizationsanddashboardsarealsonotfrequentlyupdated(-)moreadvancedvisualizationsneeded.Thesevisualizationsshouldcontainconciseinformationabouttheusersessioninareadableformat.
Visualizations
Claim3.Rules
(+)Someautomatedrulesareinplace(-)therulesareappliedonsimplefeatures(-)nocomplexruleshavebeenderived(-)significantamountoftimetotesttheefficiencyofanewrule(-)rulesarenotupdated
Analytics
Claim4.DetectionApproach
(+)analysthasknowledgeaboutcertainfeaturesandmetricsthatcanindicatefraudulentactivities(+)theprocesstodetectsuspiciousactionsisautomated(-)theprocessistime-consuming(-)UEBAcanbeappliedtodetectnewsuspiciousactivitypatterns(-)noadvancedfeatures/metricsareusedforthedetection(-)manyapplicationstomonitorinparallel(-)needtohavethepossibilitylaunchcustomdetectionmodels/featuresontheapplication.
Analytics
Claim5.Userhistory (-)userhistoryisnotconsidered Analytics
D7.1
45
Claim6.OSINTdatafeedrelatedtoapplication
(-)needtoenrichthecurrentusersessionwithexternaldatarelatedtoapplication(e.g.OSINTfeeds)
OSINTfeed
Table11–Scenario2Claims
Ideally,thecontinuoususeractivitymonitoringtoolswouldincludebotharule-basedandabehaviouralbaseddetectionmechanism.The rule-baseddetectionmechanism should allow for flexible rulemanagement including frequent ruleupdatesandfinetuning.ThebehaviouralbasedmechanismshouldallowthedefinitionofadvancedUEBAmodelsoncustomsession featuresandshouldbe frequentlyupdatedandfine-tunedtoreflectthecurrentuseractionpatterns.Thedetectiontoolsshouldalsoallowtheanalysttohavealltheenrichedresultsaccessible from a centralized visualisation tool with advanced visualizationscapabilities,allowingtheSOCteamstohaveaccuratereportsandafine-grainedunderstandingofuseractivities.The detection tools need to allow for an easy integration with external datasourcessuchasOSINTdatafeeds,GeoIPlookupdatabases,etc.Finally,anidealuseractivitymonitoringtoolshouldbehighlyflexible,allowingtheusertospecifyallparametersspecifictoaparticularapplicationmonitored:inputloglocation,UBAmodels,sessionfeatures,UBAmodelparameters,Rules,session output location, score aggregation parameters, etc. Such scenarios andclaimswillbere-evaluatedinthenextdeliverable,andtheprogresswillbenotedandmeasured.
5.3 Ethical and Legal Aspects
SIEM platforms are used by organizations to collect, process and store largequantitiesofsecurityevents,which includemultiple typesof information fromnumeroussources.Whilemostof thecollected informationcanbeclassifiedastechnical data, the volume of private information, more specifically personalinformation,isalsorelevant.ThecomponentsdevelopedinDiSIEMwillcollectandprocessdatathatislegallyprotected,forinstancebytheGDPR.17Partoftheinformationisoriginatedintheorganization’sinternalsystems(e.g.,useractivity,Internetaccesslogs,physicalaccesslogs)butthereisalsoinformationbeingcollectedfrompublicnetworks,mainlytheInternet(e.g.,Twitterdata,threatintelligencedata).Regardlessoftheinformation origin, the developers must take into consideration privacyrequirementswhendesigningandimplementingthecomponents.
17https://ec.europa.eu/info/law/law-topic/data-protection_en
D7.1
46
Duringthecomponentintegrationandevaluationphase,ethicalandlegalissuesmayarise.Whileeachpartnerorganizationisabletoaddresslegalquestionstointernallegaldepartments,itisourobjectivetomaintainpermanentcontactwiththeEthicsadvisordesignatedfortheproject.Wewilladdressanydoubtsinthisfieldtotheadvisor,makingsurethatcomponentimplementationwillnotleadtoanydoubtfuldecisionsregardinginformationprocessinganddissemination.
D7.1
47
6 Conclusions and Future Work
In this reportwe summarized theDiSIEMpreviously defined architecture anddetailed the role of each developed component, stressing their most relevantcharacteristics, objectives and interfaces. The validation environments madeavailable by the three industry partners (Amadeus, Atos and EDP) were alsodetailed,allowingforaclearervisiononhowthecomponentdemonstratorswillbeevaluated.Thesuccesscriteriafortheintegrationandvalidationphaseswerelaidout,defining thekeyaspects thatmustbeconsideredwhenevaluating thereadinessandrelevanceofthecomponents.Webelievethatbyintegratingthecomponentsinthepartner’soperationalSIEMenvironments,DiSIEMwillprovidesignificantinsightsontheirrelevancetoSOCoperations, making the transition from theoretical models to real-worldimplementations,preciselyasdefinedintheprojectgoals.ThisdocumentwillbeusedasareferencefortheintegrationactivitiesofDiSIEMthroughouttherestoftheproject.Considering that the components will be integrated in the corporateenvironments of industrial partners, there are some inevitable limitationsregarding the dissemination of results. The SIEM platform is, by definition,responsibleforprocessingandstoringcriticalinformation,includingdetailsforsecurityincidentsandvulnerabilities.SomeofthecomponentswillexploreSIEMdataandpredictablyaccessconfidentialinformation.Therefore,whenreportingoncomponentevaluation,itisnecessarytoredactpartoftheinformation,withoutcompromisingtheabilitytoassesstheprojectresults.One of the consortium’s aspirations is to include relevant stakeholders in thecomponentimplementationandintegration,maximizingthevalidationscenariosand welcoming useful feedback to improve the components. Each industrialpartnerwilladdressthisobjectivebyinteractingwithbothinternalandexternalstakeholders,whichmay include clients,differentbusinessareasandpartners.ThemoststraightforwardexampleswouldbetoincludeSOCclients,internalorexternal, and SIEM maintenance teams, especially if they include external orsubcontracted resources. Results dissemination will follow the methodologydefinedfortheDiSIEMproject,asdescribedinWP8deliverables.
D7.1
48
References
[AblonandBogart2017]L.AblonandA.Bogart.ZeroDays,Thousandsof Nights:The Life and Times of Zero-Day Vulnerabilities and Their Exploits. RandCorporation, 2017.[Bessanietal.2013]A.Bessanietal.DepSky:dependableandsecurestorageinacloud-of-clouds.ACMTransactionsonStorage(TOS),9(4),12.2013.[Carroll 2000] J. M. Carroll. Making use: scenarios and scenario-based design.Proceedingsofthe3rdconferenceonDesigning interactivesystems:processes,practices,methods,andtechniques(DIS’00).2000.[Oliveiraetal.2016]T.Oliveiraetal.ExploringKey-ValueStoresinMulti-WriterByzantine-ResilientRegisterEmulations.Proceedingsofthe 20th InternationalConferenceonPrinciplesofDistributedSystems (OPODIS’16).2016.[D21] DiSIEM Consortium. In-depth Analysis of SIEMs Extensibility. DiSIEMProjectDeliverable2.1.February2017.[D22]DiSIEMConsortium.ReferenceArchitectureandIntegrationPlan.DiSIEMProjectDeliverable2.2.May2018.[D31]DiSIEMConsortium.SecurityMetricsandMeasurements.DiSIEMProjectDeliverable3.1.November2017.[D41] DiSIEM Consortium. Techniques and Tools for OSINT-based ThreatAnalysis.DiSIEMProjectDeliverable4.1.August2017.[D52]DiSIEMConsortium.Use casedemonstrators.DiSIEMProjectDeliverable5.2.August2018.[D61] DiSIEM Consortium. Preliminary Architecture and Service Model ofInfrastructureEnhancements.DiSIEMProjectDeliverable6.1.August2017.[D62]DiSIEMConsortium.Use casedemonstrators.DiSIEMProjectDeliverable6.2.August2018.