US NDC Modernization SAND-xxxx Unclassified Unlimited Release December 2014
US NDC Modernization Iteration E1 Prototyping Report: Processing Control Framework
Version 1.1
Prepared by Sandia National Laboratories, Albuquerque, New Mexico 87185 and Livermore, California 94550
Sandia National Laboratories is a multi-program laboratory managed and operated by Sandia Corporation, a wholly owned subsidiary of Lockheed Martin Corporation, for the U.S. Department of Energy's National Nuclear Security Administration under contract DE-AC04-94AL85000.
Approved for public release; further dissemination unlimited.
SAND2014-20569R
SAND-xxxx Page 2 of 42
NOTICE: This report was prepared as an account of work sponsored by an agency of the United States Government. Neither the United States Government, nor any agency thereof, nor any of their employees, nor any of their contractors, subcontractors, or their employees, make any warranty, express or implied, or assume any legal liability or responsibility for the accuracy, completeness, or usefulness of any information, apparatus, product, or process disclosed, or represent that its use would not infringe privately owned rights. Reference herein to any specific commercial product, process, or service by trade name, trademark, manufacturer, or otherwise, does not necessarily constitute or imply its endorsement, recommendation, or favoring by the United States Government, any agency thereof, or any of their contractors or subcontractors. The views and opinions expressed herein do not necessarily state or reflect those of the United States Government, any agency thereof, or any of their contractors.
SAND-xxxx December 2014
US NDC Modernization Iteration E1 Prototyping Report:
Processing Control Framework
Ryan Prescott, Benjamin R. Hamlet
Version 1.1
Sandia National Laboratories
P.O. Box 5800
Albuquerque, New Mexico 87185
ABSTRACT
During the first iteration of the US NDC Modernization Elaboration phase (E1), the SNL US NDC modernization project team developed an initial survey of applicable COTS solutions, and established exploratory prototyping related to the processing control framework in support of system architecture definition. This report summarizes these activities and discusses planned follow-on work.
REVISIONS
Version  Date        Author/Team                 Revision Description  Authorized by
1.0      3/21/2014   US NDC Modernization Team   Initial Release       M. Harris
1.1      12/19/2014  IDC Reengineering Team      IDC Release           M. Harris
TABLE OF CONTENTS
US NDC Modernization Iteration E1 Prototyping Report: Processing Control Framework
Abstract
Revisions
Table of Contents
1. Overview
2. Schedule
3. Motivation
4. Processing Control Framework
4.1. Definition
4.2. Design Goals
4.3. Constraints
4.4. Iteration E1 Prototyping Activities
4.4.1. Initial COTS Survey
4.4.1.1. Stream Processor Frameworks
4.4.1.1.1. Apache Storm
4.4.1.1.2. Apache S4
4.4.1.1.3. Apache Samza
4.4.1.2. Java Application Frameworks
4.4.1.2.1. Java EE
4.4.1.2.2. Spring Framework
4.4.1.2.3. Application Servers
4.4.1.2.3.1. Wildfly
4.4.1.2.3.2. GlassFish
4.4.1.2.3.3. Apache Tomcat
4.4.1.2.3.4. Jetty
4.4.1.2.3.5. WebLogic
4.4.1.2.3.6. WebSphere
4.4.1.2.3.7. Conclusions
4.4.1.3. Enterprise Service Bus
4.4.1.3.1. WSO2
4.4.1.4. Complex Event Processor Frameworks
4.4.1.4.1. Esper
4.4.2. Exploratory Prototyping
4.4.2.1. Apache Storm
4.4.2.1.1. Background
4.4.2.1.1.1. Processing Model
4.4.2.1.1.2. Clustering
4.4.2.1.1.3. Fault Tolerance
4.4.2.1.1.4. Process Monitoring
4.4.2.1.2. Cluster Configurations
4.4.2.1.3. Topology Configurations
4.4.2.1.3.1. Prototyped spouts and bolts
4.4.2.1.3.2. Specifying Processing Guarantees
4.4.2.1.3.3. Assigning tuples to Processing Tasks
4.4.2.1.4. Type System
4.4.2.1.5. Multilanguage Support
4.4.2.1.6. Serialization and Messaging
4.4.2.1.7. Summary
4.4.2.2. Java EE/Wildfly 8
4.4.2.2.1. Background
4.4.2.2.1.1. Containers & EJBs
4.4.2.2.1.2. Dependency Injection
4.4.2.2.1.3. Wildfly Server Configuration
4.4.2.2.1.4. Messaging
4.4.2.2.2. Exploratory Prototype
4.4.2.2.2.1. Wildfly Server Management
4.4.2.2.2.2. Mock Seismic Pipeline
4.4.2.2.2.3. Limitations
4.4.2.2.2.4. Conclusions
4.5. Follow-On Work
4.5.1. Explore additional PCF solutions
4.5.2. Assess custom PCF solutions
4.5.3. Select a PCF solution for the executable architecture prototype
4.5.4. Develop a Basic Processing Pipeline Prototype
Appendix A. Comparison of Prototype and Existing System Processing Control Frameworks
References
1. OVERVIEW
The US NDC Modernization project statement of work identifies the definition of a modernized system architecture as a central project deliverable. As part of the architecture definition activity, the Sandia National Laboratories (SNL) project team has established an ongoing software prototyping effort to support architecture trades and analyses, as well as selection of core software technologies.
During the first iteration of the US NDC Modernization Elaboration phase (E1), spanning Q1-Q2 FY2014, the prototyping effort included initial COTS surveys and exploratory prototyping addressing three core elements of the system architecture:
1. The Common Object Interface (COI) provides the system and research tools with access to persistent data via an abstraction of the underlying storage solutions.
2. The processing control framework provides for the definition, configuration, execution and control of processing components within the system, supporting both automated processing and interactive analysis.
3. The User Interface Framework (UIF) provides a flexible platform for the definition of extensible graphical user interface (GUI) components & composition of GUI displays supporting users of the system and research tools.
This report summarizes the iteration E1 prototyping activities of the SNL project team specific to the processing control framework. E1 prototyping activities for the COI and UIF are described in separate reports.
2. SCHEDULE
This report summarizes the processing control prototyping work completed during the three-month period from December 2013 to February 2014, based on the following schedule.
Period Activity
December 2013 OSS/COTS survey
January – February 2014 Initial Exploratory Prototyping
3. MOTIVATION
Prototyping provides input critical to the definition of the system architecture, supporting selection of core software development languages and technologies, identification of architecture constraints & assumptions, and definition of high-level design patterns. In addition, the prototyping activity provides a foundation for development of the executable architecture deliverable.
4. PROCESSING CONTROL FRAMEWORK
4.1. Definition
The processing control framework is a software mechanism providing for the definition, configuration, execution and control of system processing components, supporting both automated and interactive analysis processing. The processing control framework includes the following elements:
1. An interface for defining automated processing components & processing topologies
2. A runtime environment supporting deployment, execution, monitoring and control of processing topologies
Note that the processing control framework may encompass multiple solutions supporting different types of processing within the automated and interactive analysis workflows (e.g. near-real-time vs. batch & interactive processing models).
4.2. Design Goals
Provide a fault-tolerant, horizontally scalable processing model
Provide or support a means for defining and configuring processing sequences
Provide an interface abstraction to facilitate integration of new processing algorithm implementations
Provide a messaging framework for communication of data and processing control information among processing components
Support processing components implemented in the languages to be defined for the modernized system
4.3. Constraints
COTS: Prefer Open Source Software (OSS) and other Commercial Off-The-Shelf (COTS) solutions to custom software development where available.
Standards: Prefer solutions based on open standards wherever possible.
4.4. Iteration E1 Prototyping Activities
Iteration E1 prototyping activities focused on surveying COTS software solutions (principally open source software) addressing the requirements and constraints identified thus far for the processing control framework. Candidate solutions were identified through online research into available COTS/open source tools, and through discussions with other SNL project teams knowledgeable in COTS solutions for similar applications.
Note that the survey results presented here are not exhaustive; they represent an initial effort constrained to the available E1 schedule and staffing resources. Identification and evaluation of candidate software solutions is intended to be an ongoing activity during the elaboration phase, as development of the architecture definition and executable architecture prototype progresses. Section 4.5 identifies additional survey work scheduled for iteration E2 and beyond.
The survey effort included a first-order assessment of candidates to eliminate those solutions not well suited to the US NDC and IDC applications. Additional investigation, including limited exploratory prototyping, was conducted for promising candidates not eliminated as part of the survey. Similar follow-on investigations may be conducted for other candidates as part of future work.
4.4.1. Initial COTS Survey
The candidates surveyed as part of the E1 prototyping work can be organized into four categories:
1. Stream Processor Frameworks
Stream processing frameworks are largely focused on providing infrastructure for real-time data analytics on unbounded streams of continuously arriving data. They facilitate real-time, rather than batch mode, cluster-based parallel computation. The stream processing frameworks surveyed are Apache Storm [1], Apache S4 [2], and Apache Samza [3].
2. Java Application Frameworks
The Java EE [24] and Spring [25] application frameworks provide general support for the development, configuration, deployment and management of scalable, secure, distributed applications, including both automated and interactive processing. Both frameworks are widely used in industry, and are well suited for the development of server-side applications.
3. Enterprise Service Bus Frameworks
Enterprise Service Bus implementations provide general support for systems built using Service Oriented Architecture (SOA). ESBs can be thought of as providing services used by the primary system services. The WSO2 ESB [5] was surveyed for this prototype. Another ESB, Mule [6], was used in the SOA proof of concept project completed during Inception Iteration 2.
4. Complex Event Processors
Complex Event Processors are similar to Stream Processors in that they focus on real-time data analytics for continuously arriving data. Complex Event Processors are generally based around a query engine used to select stream data for processing. This approach lends itself to dynamic topologies that evolve with the processing results and data arriving on the stream, whereas Stream Processors tend to use statically configured topologies that are run on all incoming stream data. The Esper Complex Event Processor [4] was surveyed.
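The distinction between query-driven selection and a statically configured topology can be illustrated with a minimal sketch in plain Java. This is not the Esper API: the class names (QuerySketch, Event, Query), the station labels and the amplitude threshold are all hypothetical, chosen only to show a query being registered at run time in response to an earlier result.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical sketch of query-driven event selection; this is NOT the Esper API.
class QuerySketch {
    static class Event {
        final String station;
        final double amplitude;
        Event(String station, double amplitude) {
            this.station = station;
            this.amplitude = amplitude;
        }
    }

    // A query selects events from the stream and collects the ones it matches.
    static class Query {
        final Predicate<Event> where;
        final List<Event> matches = new ArrayList<>();
        Query(Predicate<Event> where) {
            this.where = where;
        }
    }

    private final List<Query> queries = new ArrayList<>();

    // Queries can be registered at any time, including while events are arriving.
    Query register(Predicate<Event> where) {
        Query q = new Query(where);
        queries.add(q);
        return q;
    }

    // Each arriving event is offered to every query registered so far.
    void onEvent(Event e) {
        for (Query q : queries) {
            if (q.where.test(e)) {
                q.matches.add(e);
            }
        }
    }

    static int[] demo() {
        QuerySketch engine = new QuerySketch();
        Query strong = engine.register(e -> e.amplitude > 5.0);
        engine.onEvent(new Event("STA1", 2.0));   // matches no query
        engine.onEvent(new Event("STA2", 7.5));   // matches "strong"
        // A follow-up query is added at run time, in response to the earlier match.
        Query followUp = engine.register(e -> e.station.equals("STA2"));
        engine.onEvent(new Event("STA2", 1.0));   // matches only "followUp"
        return new int[] { strong.matches.size(), followUp.matches.size() };
    }

    public static void main(String[] args) {
        int[] counts = demo();
        System.out.println(counts[0] + " " + counts[1]);
    }
}
```

In a statically configured topology, by contrast, the set of processing steps would be fixed before the first event arrives.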
The candidates surveyed reflect a move to modern development languages within the Java Virtual Machine (JVM) ecosystem, principally Java. The dominance of Java among the candidates is a reflection of its prominence within the solution space.
The candidates also reflect an alignment to the industry state of practice for mission critical application development, where stability and maturity are important factors to be balanced against cutting edge innovation. Given the required longevity of the modernized system, COTS solutions with greater prevalence and larger development communities were preferred to newer, less well-established offerings.
Candidates were assessed based on the quality and applicability of their feature sets, as well as their maturity and the apparent strength of their user/development communities. Survey results are summarized for each candidate in the following sections.
4.4.1.1. Stream Processor Frameworks
The three Stream Processing Frameworks surveyed approach the problem of real-time data analytics for streams of incoming data in similar ways. Each of the products was developed by a well-known internet company, and each is currently an open source project managed by Apache. Because the products are so similar, differentiating among them involves downselection based either on specific implementations of a key feature or on qualities like industry adoption, user support, or potential for longevity.
4.4.1.1.1. Apache Storm
Apache Storm is an open source project originally released to GitHub by Twitter in September 2011. It has been managed as an Apache Incubator project since September 2013. Storm is currently under active development. Storm 0.9.0.1, released in December 2013, was used for this prototype.
Storm processing is organized as a topology (i.e. a dataflow graph) of spouts that provide data and bolts that perform processing. Figure 1 depicts a simple Storm topology. While Storm runs in the JVM, it has several built-in features supporting development in non-JVM languages. The processing topologies used by Storm are defined using a language independent format (Apache Thrift [12]) that allows Storm topologies to be developed in a wide range of programming languages. Storm also has built-in support for accessing spouts and bolts developed using non-JVM languages through the multilang protocol.
Figure 1. Storm topologies are directed graphs where the edges represent data tuples flowing between spouts and bolts (footnote 1).
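The topology concept can be illustrated with a minimal sketch in plain Java. This is not the Storm API: the names TopologySketch, Bolt and runSpout and the sample tuples are hypothetical, and the sketch runs in a single thread, whereas Storm distributes spouts and bolts across a cluster. It shows only the graph structure, with a source pushing tuples along edges to processing nodes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical single-threaded model of a dataflow topology; this is NOT the Storm API.
class TopologySketch {
    // A bolt transforms one tuple and forwards the result along its outgoing edges.
    static class Bolt {
        final String name;
        final Function<String, String> logic;
        final List<Bolt> downstream = new ArrayList<>();
        final List<String> emitted = new ArrayList<>();

        Bolt(String name, Function<String, String> logic) {
            this.name = name;
            this.logic = logic;
        }

        void execute(String tuple) {
            String out = logic.apply(tuple);
            emitted.add(out);
            for (Bolt next : downstream) {
                next.execute(out);
            }
        }
    }

    // A spout is a source of tuples: each tuple is pushed to every subscribed bolt.
    static void runSpout(List<String> source, List<Bolt> subscribers) {
        for (String tuple : source) {
            for (Bolt b : subscribers) {
                b.execute(tuple);
            }
        }
    }

    static List<String> demo() {
        Bolt upper = new Bolt("upper", String::toUpperCase);
        Bolt exclaim = new Bolt("exclaim", s -> s + "!");
        upper.downstream.add(exclaim);                           // edge: upper -> exclaim
        runSpout(List.of("p wave", "s wave"), List.of(upper));   // edge: spout -> upper
        return exclaim.emitted;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```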
Storm has a pluggable messaging system with existing implementations for ZeroMQ [10] and Netty [11]. Even though these messaging systems do not have the strong reliability guarantees of projects like Kafka [7] and Zookeeper [9], Storm provides a variety of processing guarantees that include “all data is processed at least once” and “all data is processed exactly once”. Processing components written by users shoulder some of the bookkeeping responsibilities to achieve “at least once” processing guarantees, as Storm requires explicit acknowledgement when bolts finish processing a tuple. “Exactly once” processing guarantees are achieved through a separate topology definition API (known as Trident) that does not require bookkeeping in client code.
1 Reproduced from the Storm tutorial (http://storm.incubator.apache.org/documentation/Tutorial.html), which is available on the Apache Software Foundation’s Storm website [1].
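The bookkeeping behind an “at least once” guarantee can be illustrated with a minimal sketch in plain Java. This is not the Storm API: the class AckSketch and its methods emit, ack and tuplesToReplay are hypothetical names. The sketch assumes the source keeps every emitted tuple until the processing component explicitly acknowledges it, so anything left unacknowledged can be replayed.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical bookkeeping model for "at least once" processing; this is NOT the Storm API.
class AckSketch {
    // Tuples that have been emitted but not yet acknowledged, keyed by tuple id.
    private final Map<Integer, String> pending = new LinkedHashMap<>();
    private int nextId = 0;

    // The source remembers every emitted tuple until it is acknowledged.
    int emit(String tuple) {
        int id = nextId++;
        pending.put(id, tuple);
        return id;
    }

    // Called by the processing component once it has finished with a tuple.
    void ack(int id) {
        pending.remove(id);
    }

    // Tuples that were never acknowledged are candidates for replay.
    List<String> tuplesToReplay() {
        return new ArrayList<>(pending.values());
    }

    static List<String> demo() {
        AckSketch tracker = new AckSketch();
        int a = tracker.emit("tuple-a");
        tracker.emit("tuple-b");            // simulated failure: never acknowledged
        int c = tracker.emit("tuple-c");
        tracker.ack(a);
        tracker.ack(c);
        return tracker.tuplesToReplay();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Because a tuple may be replayed after it was partly processed, this style of bookkeeping yields “at least once” rather than “exactly once” behavior: downstream components may see the same tuple more than once.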
Storm provides a custom cluster resource management system. This system uses Zookeeper to coordinate cluster nodes and is resilient to node and process failures. Resource allocation is defined statically but can be updated dynamically using a command line interface. Since this requires manual intervention, Storm does not provide automatic, dynamic processing elasticity.
Storm has experienced high adoption in the open source community and is used by several prominent companies for data analytics tasks.
4.4.1.1.2. Apache S4
Apache S4 is an open source project released by Yahoo in October 2010. It has been an Apache Incubator project since September 2011.
S4 topologies are defined in Java code as a graph of Processing Elements and Data Streams connecting the Processing Elements.
S4 uses Zookeeper for the communication layer, which provides persistent and durable messaging to S4. S4 processing topologies are written in Java. S4 implements a check-pointing system to provide processing guarantees and to prevent data loss.
S4 uses Apache Helix [13] for cluster resource management, load balancing, dynamic resource scalability, and fault tolerance.
Despite being available publicly for several years, S4 has not fostered widespread interest in the open source community. This fact brings long term support and development of the project into question.
4.4.1.1.3. Apache Samza
Apache Samza is an open source project released publicly in September 2013. It was originally developed by LinkedIn to support their real-time data analytics needs.
Samza topologies are defined in configuration files. There is no built-in support for running non-Java components within processing topologies. Samza can provide guarantees that “all data is processed at least once” or “all data is processed exactly once” by a topology without exposing the bookkeeping tasks or states to client code.
Samza uses Apache Kafka for messaging. Kafka provides reliable messaging with guaranteed message delivery. Kafka is itself a distributed application and provides parallel message processing, load balancing, and certain message delivery order guarantees. Since Kafka persists its messages to disk (and uses Zookeeper for coordination), Samza nodes or processes can fail without losing messages and without messages that are pending delivery overwhelming the available system memory. Keeping all messages persisted to disk facilitates data replay through Samza topologies.
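The replay property of a persisted message log can be illustrated with a minimal sketch in plain Java. This is not the Kafka API: the class LogSketch and its methods append and readFrom are hypothetical names, and an in-memory list stands in for the on-disk log. The sketch assumes consumers track their own read offset, so reading never removes a message.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory stand-in for a persisted message log; this is NOT the Kafka API.
class LogSketch {
    // Append-only storage; a real log of this kind would be written to disk.
    private final List<String> log = new ArrayList<>();

    // Appending returns the offset at which the message was stored.
    int append(String message) {
        log.add(message);
        return log.size() - 1;
    }

    // Reading from an offset returns a copy and leaves the stored messages in place.
    List<String> readFrom(int offset) {
        return new ArrayList<>(log.subList(offset, log.size()));
    }

    static List<String> demo() {
        LogSketch topic = new LogSketch();
        topic.append("m0");
        int recorded = topic.append("m1");   // offset the consumer last recorded
        topic.append("m2");
        // After a failure, the consumer restarts from the offset it last recorded.
        return topic.readFrom(recorded);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Because the log is never consumed destructively, the same call with offset 0 would replay the full message history through a topology.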
Samza uses Apache YARN [8] for resource management. YARN is a compute cluster resource management and task scheduling utility originally developed for the popular Hadoop map-reduce batch processing project. Samza benefits from YARN’s load balancing, node management, and process and resource isolation features. Samza does not support dynamic elasticity for cluster resource allocation.
Samza has not been available as open source software long enough to accurately gauge future community interest, support, or long-term viability.
4.4.1.2. Java Application Frameworks
The Java application frameworks surveyed in E1 represent the two most prevalent enterprise Java application development solutions: Java Enterprise Edition (Java EE), and Spring.
Java EE and Spring were included in the survey because both provide technologies for the development of modular processing architectures composed of loosely-coupled, distributed processing components interacting through well-defined interfaces. Java EE and Spring provide a very similar set of capabilities, and generally follow similar design patterns. Spring has influenced the recent evolution of a number of Java EE standards.
It should be noted that both Java EE and Spring provide many additional capabilities of interest to the modernized system architecture that fall outside the scope of the processing control framework and so are not addressed here. One example relates to data persistence abstractions (e.g. Java Persistence API, Object Relational Mappings & Data Access Objects), which are addressed separately as part of the Common Object Interface (COI) prototyping effort. Another example relates to client presentation frameworks (e.g. JavaServer Faces, Struts and Spring Model-View-Controller), which are addressed as part of the user interface framework prototyping effort.
Both Java EE and Spring are typically deployed using an application server, which provides an implementation of the core functions of the application framework. An assessment of the most prominent application servers is included in Section 4.4.1.2.3.
4.4.1.2.1. Java EE
Java EE is the standard enterprise Java computing platform. It provides a widely-supported, open standard enabling the development of vendor and platform-agnostic software. Java EE specifies a comprehensive set of technology standards addressing multi-tiered, scalable, reliable Java applications. Among the standards, the E1 survey focused on those related to enterprise application development, including:
Enterprise JavaBeans (EJB) provide an architecture for the development and deployment of component-based applications.
Contexts & Dependency Injection (CDI)/Dependency Injection for Java provide a flexible application configuration model minimizing coupling between components.
Interceptors provide support for managing Aspect Oriented Programming (AOP) functions such as logging, auditing, and profiling.
Java Transaction API (JTA) provides support for transactions within the application.
Java Message Service (JMS) supports messaging between processing components using reliable, asynchronous, loosely coupled communication.
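The reduction in coupling attributed to dependency injection in the list above can be illustrated with a minimal sketch in plain Java. No Java EE container or annotation is used: the wiring that a CDI container would perform is done by hand in the demo method, and the names InjectionSketch, Detector, ThresholdDetector and Pipeline, along with the threshold value, are hypothetical.

```java
// Plain-Java sketch of constructor injection; the names are hypothetical and
// the wiring is done by hand where a container would normally perform it.
class InjectionSketch {
    // The abstraction that processing code depends on.
    interface Detector {
        boolean detect(double[] samples);
    }

    // One interchangeable implementation: flags any sample above a fixed threshold.
    static class ThresholdDetector implements Detector {
        @Override
        public boolean detect(double[] samples) {
            for (double s : samples) {
                if (Math.abs(s) > 1.0) {
                    return true;
                }
            }
            return false;
        }
    }

    // The pipeline depends only on the interface and never names a concrete class.
    static class Pipeline {
        private final Detector detector;

        Pipeline(Detector detector) {
            this.detector = detector;
        }

        String process(double[] samples) {
            return detector.detect(samples) ? "signal" : "noise";
        }
    }

    static String demo() {
        Pipeline pipeline = new Pipeline(new ThresholdDetector());  // manual wiring
        return pipeline.process(new double[] { 0.1, -2.5, 0.3 });
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Swapping in a different Detector implementation requires no change to Pipeline, which is the property that makes new algorithm implementations easier to integrate.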
Java EE is a highly stable, mature technology with large and well established user and development communities.
4.4.1.2.2. Spring Framework
The Spring framework is an open source application framework for the Java platform. Whereas Java EE provides a set of standards for which multiple implementations are available, the Spring framework provides a set of concrete technologies with many capabilities similar to those available in Java EE implementations. Unlike Java EE, the Spring framework is not based on open standards.
Among the Spring framework technologies, the E1 survey focused on those related to enterprise application development, including:
Spring Inversion of Control (IoC) Containers provide an architecture for the development of component-based applications, as well as a flexible application configuration model similar to Java EE EJBs and CDI.
Spring Aspect Oriented Programming (AOP)/AspectJ provide support for managing Aspect Oriented Programming (AOP) functions similar in capability to Java EE Interceptors.
Support for other relevant capabilities addressed by the Java EE standard (e.g. transaction management, messaging) is provided through integrations with third party solutions, including Java EE implementations of JTA and JMS.
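The idea behind interceptors and AOP, adding a cross-cutting function such as logging without modifying the component itself, can be illustrated with a minimal sketch using the standard java.lang.reflect.Proxy class. Neither Java EE Interceptors nor Spring AOP is used here: the names InterceptorSketch, Filter, withLogging and auditLog are hypothetical, and the frameworks apply the same idea through their own configuration mechanisms.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

// Sketch of the interception idea using a JDK dynamic proxy; names are hypothetical.
class InterceptorSketch {
    interface Filter {
        double apply(double x);
    }

    // Records what the logging wrapper observed.
    static final List<String> auditLog = new ArrayList<>();

    // Wraps any Filter so that every call is logged, without changing the Filter's code.
    static Filter withLogging(Filter target) {
        InvocationHandler handler = (proxy, method, args) -> {
            auditLog.add("before " + method.getName());
            Object result = method.invoke(target, args);
            auditLog.add("after " + method.getName());
            return result;
        };
        return (Filter) Proxy.newProxyInstance(
                Filter.class.getClassLoader(),
                new Class<?>[] { Filter.class },
                handler);
    }

    static double demo() {
        Filter gain = x -> x * 2.0;           // the component itself knows nothing of logging
        return withLogging(gain).apply(3.0);
    }

    public static void main(String[] args) {
        System.out.println(demo());
        System.out.println(auditLog);
    }
}
```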
The Spring framework provides an alternative to Java EE for the development of Java applications. As with Java EE implementations, the Spring framework is a highly stable, mature technology with large and well established user & development communities.
4.4.1.2.3. Application Servers
Application servers provide a runtime environment for the execution of application software. In the case of Java, the application server acts as an extension of the JVM that provides core application services such as support for connection pooling, clustering, fail-over, and load-balancing.
A number of open source and commercial application servers are available which provide support for Java EE and/or Spring applications. In the case of Java EE, these application servers provide implementations of the Java EE standards. In the case of Spring, they provide an enhanced runtime environment for applications built on the Spring framework.
It should be noted that both Java EE and Spring applications can be developed to execute outside of the application server. This capability is provided by default as part of the Spring framework. Java EE provides an embedded container accessible from the Java Standard Edition (SE) environment that implements a subset of Java EE standards, including a light-weight EJB implementation, as well as support for transactions, security and AOP concerns (interceptors).
The most prevalent application servers providing support for Java EE and/or Spring are described briefly in the sections below.
4.4.1.2.3.1. Wildfly
Wildfly [26, 27], formerly known as JBoss AS, is an application server developed by the JBoss division of Red Hat. Wildfly provides full support for the latest Java EE standards (Java EE 7), as well as support for deployment of Spring applications. Wildfly is free and open source software. Red Hat provides optional commercial support for JBoss Enterprise middleware on a subscription basis. Wildfly is a mature, stable and widely adopted solution [29] that has been under active development since 1999.
4.4.1.2.3.2. GlassFish
GlassFish [14] is an application server originally developed by Sun Microsystems and currently managed by the Oracle Corporation. GlassFish is the reference implementation of Java EE, and provides complete support for the Java EE standards. It also provides support for deployment of Spring applications. GlassFish is open source software. A commercial version known as Oracle GlassFish Server is also currently provided by Oracle.
In 2013, Oracle announced that it will discontinue commercial support for GlassFish. Although the community version will be supported at least through version 5 (the current version is 4) and will serve as the reference implementation through at least Java EE 8 (the current version is 7), the longer term future of GlassFish is uncertain.
4.4.1.2.3.3. Apache Tomcat
Apache Tomcat [15] is a web server developed by the Apache Software Foundation (ASF). Although it is often considered together with Java EE application servers owing to its support for the Java Servlet and JSP standards, it does not provide full support for the Java EE standards, notably excluding support for Enterprise JavaBeans (footnote 2). Tomcat does provide support for deployment of Spring applications. Tomcat enjoys a large user base and development community; it is the most widely used web application server on the market [29] and presents a compelling environment for Spring applications. Tomcat is free and open source software.
4.4.1.2.3.4. Jetty
Jetty [16] is a web server project developed as part of the Eclipse Foundation. As with Tomcat, Jetty is often considered together with Java EE application servers owing to its support for the Java Servlet and JSP standards; however, it does not provide full support for the Java EE standards, notably excluding support for Enterprise JavaBeans. It does provide support for deployment of Spring applications. Jetty enjoys a large user base, and its popularity is on the rise [29]. Jetty is free and open source.
4.4.1.2.3.5. WebLogic
WebLogic Server [17] is a proprietary Java EE platform developed by the Oracle Corporation that provides a number of large-scale enterprise solutions, including a Java EE application server, web portal, Enterprise Application Integration (EAI) platform, transaction server (Tuxedo), telecommunication platform and web server. WebLogic Server is a robust, mature and comprehensive solution for large-scale enterprise applications. As with IBM WebSphere, it is considered to be a more complex and heavyweight solution than the other application servers surveyed [28].
2 Apache provides a Java EE compliant application server known as TomEE, which combines Apache Tomcat with additional Java EE support, including OpenEJB, OpenJPA and others. TomEE is a recent offering (2012) and has yet to establish a significant user community beyond that of Tomcat. It was not evaluated as part of the E1 survey.
4.4.1.2.3.6. WebSphere
WebSphere [18] is a proprietary suite of enterprise application integration middleware developed by IBM. At the center of the WebSphere product line is the WebSphere Application Server, which provides a robust, mature & comprehensive solution for large-scale enterprise applications. As with Oracle’s WebLogic Server, the WebSphere application server is considered to be a more complex and heavyweight solution than the other application servers surveyed [28].
4.4.1.2.3.7. Conclusions
Overall, Java EE and Spring provide very similar capabilities for the development of Java applications. Both are mature and stable technologies. Spring is more widely used than Java EE; however, both enjoy a large user base and developer community [29].
Standardization presents an important distinction between the two. Java EE is a widely supported, open standard enabling the development of vendor- and platform-agnostic software. In contrast, Spring is a non-standard, open-source solution developed and maintained by a single commercial vendor.³ Spring has significantly influenced the Java EE standard, and is considered by some to be a de facto standard.
All of the application servers considered in the E1 survey are mature, robust products. As shown in Figure 2, Tomcat is the most widely used of the servers assessed. However, Tomcat does not include support for a number of important Java EE standards, most notably EJBs. Tomcat is compelling as a Spring deployment technology. Among the application servers providing full Java EE support, Wildfly (formerly JBoss AS) is the most widely used, making it a compelling choice for Java EE applications. Although GlassFish is the current reference implementation of Java EE, Oracle’s discontinuation of commercial support calls into question its long-term viability.
WebSphere and WebLogic are intended for large-scale enterprise application development, and thus are likely more complex and heavyweight solutions than required for the modernized system architecture.
³ Spring is currently developed by Pivotal (www.gopivotal.com).
Figure 2. Java Application Server Usage Based on 2012 Developer Survey⁴
4.4.1.3. Enterprise Service Bus
ESBs are general purpose integration solutions supporting service oriented systems. ESBs are typically used in systems where there is a need to integrate a variety of products that were created in disparate programming languages, which run on different types of hardware, or which are provided by separate organizations. The common theme in this scenario is that there may not be a common system architecture used when implementing the individual services, and yet there is a need and benefit to using them together. The ESB’s goal is to bridge the gaps created by such heterogeneity. ESBs are most commonly used with web services and were studied as part of a SOA proof of concept project completed during Inception Iteration 2. Mule ESB was used in that project and therefore was not surveyed as part of exploratory prototyping. In general, Mule ESB provides similar features to those studied in the WSO2 survey.
4.4.1.3.1. WSO2
WSO2 is a Java ESB based on Apache Synapse. The first version of WSO2 was released in 2007, though Apache Synapse has been available since 2005.
WSO2 provides endpoint (i.e. service) failover and load balancing and can be integrated with a separate WSO2 product, the WSO2 Elastic Load Balancer, to achieve elastic load balancing at the level of either the ESB or of individual endpoints. Support for implementing data processing guarantees is available through transactional messaging, which can be used to guarantee message delivery.
⁴ Reproduced from the article “Developer Productivity Report 2012: Java Tools, Tech, Devs & Data” by Oliver White. See the References section for full citation.
WSO2 facilitates processing topology definitions through a type of endpoint referred to as a mediation sequence. When a mediation sequence receives a message it will execute a sequence of other mediators in a user-defined order. Built-in mediators are available for common tasks such as sending messages to services, aggregating messages received from services, providing conditional message routing to services, and providing priority-based service execution. Custom mediators can be written by users for other processing tasks.
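The idea of a mediation sequence can be illustrated with a minimal plain-Java sketch. The class and method names below (MediationSequence, then, process) are hypothetical and are not part of any ESB product API; the sketch only shows mediators being applied to a message in a user-defined order.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

// Hypothetical model of a mediation sequence; these names are not part of any ESB API.
final class MediationSequence {
    private final List<UnaryOperator<String>> mediators = new ArrayList<>();

    // Mediators run in the order in which they are added.
    MediationSequence then(UnaryOperator<String> mediator) {
        mediators.add(mediator);
        return this;
    }

    // Pass the message through every mediator, feeding each output to the next.
    String process(String message) {
        String current = message;
        for (UnaryOperator<String> mediator : mediators) {
            current = mediator.apply(current);
        }
        return current;
    }
}
```

A sequence built as new MediationSequence().then(String::trim).then(String::toUpperCase) would trim a message and then convert it to upper case, in that order.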
Since ESBs provide a central hub facilitating service integration, distributed parallel processing, support for multiple programming languages, secure messaging, and interoperability with persistence mechanisms are all possible and supported at various levels by WSO2 and other full-featured ESBs.
4.4.1.4. Complex Event Processor Frameworks
4.4.1.4.1. Esper
Esper is a real-time data processing system that works on streams of incoming data. Esper, like other complex event processors, differentiates itself from the stream processors (e.g. Storm, Samza, S4) in that the programming model is based on a specialized query language enabling selection of subsets of data from the incoming stream for processing rather than on processing all arriving data with a fixed topology.
Esper is open source and has been under development since 2004. It is managed by EsperTech and has seen adoption by a number of prominent companies (including PayPal and Raytheon). EsperTech provides an enterprise version of Esper that supports clustering. A separate project, EsperHA, provides high availability and guarantees no data loss.
Esper is built around a powerful, efficient, and expressive query engine. Queries executed against the engine can be likened to database queries occurring on a non-persistent, transient stream of incoming data. Data selected for processing can be injected back into the stream for exposure to other selection queries. Processing topologies in Esper would likely be crafted in client code as a series of data selection queries. Even though Esper’s query engine is analogous to those provided by databases, Esper itself does not provide a data persistence mechanism.
4.4.2. Exploratory Prototyping
Based on the results of the initial survey, two of the candidates were selected for further investigation and exploratory prototyping in the E1 timeframe: Apache Storm and Java EE/Wildfly.
As discussed in Section 4.4.2.1, the feature set provided by the stream processor candidates holds promise for the processing control requirements of the system. Storm was selected as the most well-established and widely used of the stream processors. In contrast, Samza is a new offering that has yet to establish a significant user base and the S4 project appears to be largely dormant.
ESB candidates were not investigated further, given that these technologies, and Mule ESB in particular, were assessed as part of the SOA proof of concept project completed during Inception Iteration 2.
The event query paradigm central to the complex event processor solutions was judged to be too specialized for the more general processing needs of the system’s automated and interactive analysis processing. Thus, these candidates were not investigated further in E1.
A Java EE implementation was selected for further investigation rather than the Spring Framework based on the stated preference for solutions built on open standards. As discussed in Section 4.5, further investigation of the Spring framework has been identified as potential follow-on work. Wildfly was selected for the Java EE exploratory prototype as the most widely used application server providing full Java EE support.
Table 2 in Appendix A provides a summary comparison of feature sets between the E1 prototype candidates and existing systems.
4.4.2.1. Apache Storm
4.4.2.1.1. Background
Apache Storm is a framework that manages real-time, distributed, fault-tolerant, and multi-language data processing applications. Storm is real-time as it is meant to process data as it becomes available rather than storing data for batch processing at a later time. Storm is distributed to support processing high data loads and to provide resilience to machine failures. Storm is fault tolerant as it can provide both “at least once” and “exactly once” processing guarantees for each piece of data it processes. Storm is multi-language since the fundamental definitions for processing components and processing topologies are language independent.
4.4.2.1.1.1. Processing Model
A Storm topology is a directed graph (either cyclic or acyclic) where each node is either a spout or a bolt. Storm uses spout nodes to emit data into topologies, so these nodes have zero incoming edges and one or more outgoing edges to other nodes in the topology. Spout nodes will typically interface with files or applications external to the Storm topology to access the data they emit. Storm uses bolt nodes to process data within a topology. Since bolts can emit their results into the topology for further processing, bolts have one or more incoming edges and zero or more outgoing edges to other nodes in the topology. Storm passes data tuples along the edges between nodes. Tuples can contain any number of data elements and there are no limitations on data types except that all data types must have a serialization so that Storm can pass data between processes and machines.
Storm exploits two types of parallel processing. First, multiple spouts and bolts in a topology can simultaneously process different tuples. Second, Storm can replicate spouts and bolts to multiple processing tasks. This allows simultaneous execution of the same underlying spout or bolt code on different data tuples.
4.4.2.1.1.2. Clustering
Storm topologies run on Storm clusters. Storm clusters contain a master node and one or more worker nodes. The master node runs Storm’s Nimbus daemon, which is responsible for distributing the spout and bolt code used within topologies to the worker nodes, assigning tasks to workers, and monitoring the worker nodes for failures. Each worker node runs the Supervisor daemon, which receives data arriving at the machine and assigns it to worker processes. Each worker process contains one or more Executor threads, and each Executor runs one or more tasks. A task is either a spout or a bolt. The Supervisor is also responsible for starting and stopping worker processes on the machine.
Storm has a custom cluster manager that determines which worker nodes perform which processing tasks. As discussed in Section 4.4.2.1.3.3, Storm supports some user specified groupings that allow clients to indicate where a component should run relative to other components. Figure 3 shows a simple Storm topology with one spout and two bolts running on a cluster with four worker nodes. Each worker process has several Executor threads which run processing tasks. The tasks are color coded according to which spout or bolt they run. Storm assigns the work of each spout and bolt to one of the available tasks as data becomes available for processing. Storm’s ability to distribute a topology’s work in this manner is how it provides horizontal scalability.
Figure 3. Mapping Storm processing topologies to clusters⁵
Storm Nimbus and Supervisor daemons use Zookeeper and the local file system on each cluster node to store processing states. Zookeeper is used to coordinate the runtime cluster configuration and state parameters required to determine how topologies are distributed across machines and how the current work is distributed across the worker nodes. Storm uses the local file system to store larger chunks of distributed data, such as topology definition files. Storing processing state outside of Storm allows the Storm daemons and Storm workers to independently fail and restart without causing failures to the running topologies. Zookeeper also runs on a cluster, providing a layer of data resilience and availability in the presence of machine failures. Storm does not expose these persistence mechanisms to clients, so client codes are responsible for handling their data management and persistence needs independent of Storm.
4.4.2.1.1.3. Fault Tolerance
Storm implements a reliability model that can guarantee all data emitted from a spout is processed by a topology. However, if Storm is used in a broader system where a spout is a consumer accessing data from an external producer, Storm cannot guarantee that all data created by the producer are consumed by the spout or that all data consumed by the spout are emitted into the topology. These kinds of processing guarantees are attainable if the Storm spout has durable access to external data. These types of processing guarantees are available in the exploratory prototype through use of a Storm spout that consumes data from the Apache Kafka fault tolerant producer-consumer messaging framework.
⁵ Adapted from Michael Noll’s Understanding the Parallelism of a Storm Topology (http://www.michael-noll.com/blog/2012/10/16/understanding-the-parallelism-of-a-storm-topology/) [31].
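The role of durable access can be illustrated with a small plain-Java sketch. It is an in-memory stand-in for an offset-addressed log and does not use the Kafka or Storm APIs; the class name DurableLog and its methods are hypothetical. The point it illustrates is that the consumer's position advances only after a record has been emitted, so a consumer that fails between reading and emitting sees the same record again on restart.

```java
import java.util.ArrayList;
import java.util.List;

// In-memory stand-in for a durable, offset-addressed log; not the Kafka API.
final class DurableLog {
    private final List<String> records = new ArrayList<>();
    // In a real system this position would be stored durably so it survives consumer restarts.
    private int committedOffset = 0;

    // Producer side: append a record to the end of the log.
    void publish(String record) {
        records.add(record);
    }

    // Returns the next unconsumed record, or null when the consumer is caught up.
    String peek() {
        return committedOffset < records.size() ? records.get(committedOffset) : null;
    }

    // Called only after the record has been emitted into the topology.
    void commit() {
        if (committedOffset < records.size()) {
            committedOffset++;
        }
    }
}
```

Because peek() does not advance the offset, a record that was read but never committed is simply returned again, which is the behavior that lets a restarted consumer avoid skipping data.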
4.4.2.1.1.4. Process Monitoring
Storm provides several process monitoring and diagnostic tools. Command line tools are used to start, stop, rebalance, and perform simple introspection on the Storm cluster and the topologies it is running. Storm also has a web client providing information on cluster resource availability, consumption, and uptimes; topology resource consumption and uptimes; and worker node uptimes. Additionally, all Storm daemons and worker processes write detailed log files, including status updates and exception stack traces, to a common directory.
4.4.2.1.2. Cluster Configurations
The Apache Storm exploratory prototyping effort involved configuring a two node Storm cluster capable of running a variety of Storm processing topologies. One node in the two node cluster was configured as both a master and a worker while the second node was exclusively a worker node. The master node therefore ran Storm Nimbus, Storm Supervisor, Zookeeper, and Kafka while the worker node only ran Storm Supervisor. An operational cluster would likely involve multiple nodes forming a Zookeeper cluster, multiple nodes forming a Kafka cluster (if Kafka was used), and additional Supervisor nodes. Storm Nimbus cannot be distributed, so it must run on a single node regardless of cluster size.
Adding a new worker node to a Storm cluster is a straightforward process that minimally involves installing Storm, installing Storm’s messaging library, and editing a Storm configuration file with references to the Zookeeper and Nimbus machines. Since Storm is designed to immediately close the JVM, and therefore close the Supervisor, whenever an exception occurs, it is also recommended to run Storm under process supervision. The prototype machines were configured to use the process supervisor named supervisor. Process supervision was configured to start the Storm Supervisor on the worker node and Storm Nimbus, Storm Supervisor, Kafka, and Zookeeper on the master node. Since Storm can restart worker tasks after failure and the process supervisor can restart Storm after it fails, running Storm under process supervision provides a durable processing cluster. A convenient side effect of process supervision is that all processes required to run the master and worker nodes are started when the process supervisor is started, so only one step per machine is required to start the cluster.
4.4.2.1.3. Topology Configurations
4.4.2.1.3.1. Prototyped spouts and bolts
Each processing topology created for this prototype consisted of three tasks: one spout task producing waveform tuples, one bolt task consuming waveform tuples and producing signal detection tuples, and one bolt task consuming signal detection tuples and producing event hypothesis tuples. All of the data produced in these tasks was randomly generated. The tasks did not run actual algorithms to detect signals or build event hypotheses. Figure 4 lists the Java code used to define these topologies. The listing shows how each Storm component is assigned a textual identifier that is then used to link it to other components (e.g. the “detections” bolt receives input from the “waveforms” spout). The numbers at the end of each setSpout() and setBolt() call indicate the number of tasks running the associated component.
Figure 4. Sample Storm code to define a processing topology.
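The wiring pattern described for the topology definition can be sketched in plain Java. This is not the listing from Figure 4 and it does not use Storm's classes: TopologySketch is a hypothetical stand-in whose setSpout and setBolt methods only mimic the idea of registering a component under a textual identifier, stating its task count, and naming the component it reads from. The identifier "events" is an assumption; "waveforms" and "detections" are the identifiers mentioned in the text.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Plain-Java stand-in for a topology builder; it does not use Storm's classes.
final class TopologySketch {
    // Component id -> number of tasks running that component.
    private final Map<String, Integer> tasks = new LinkedHashMap<>();
    // Bolt id -> id of the component it reads from.
    private final Map<String, String> inputs = new LinkedHashMap<>();

    // Register a data source under a textual identifier.
    TopologySketch setSpout(String id, int parallelism) {
        tasks.put(id, parallelism);
        return this;
    }

    // Register a processing step and link it to an already registered source by id.
    TopologySketch setBolt(String id, int parallelism, String sourceId) {
        if (!tasks.containsKey(sourceId)) {
            throw new IllegalArgumentException("unknown source: " + sourceId);
        }
        tasks.put(id, parallelism);
        inputs.put(id, sourceId);
        return this;
    }

    int taskCount(String id) {
        return tasks.get(id);
    }

    String inputOf(String id) {
        return inputs.get(id);
    }

    // Total number of tasks across all components.
    int totalTasks() {
        int total = 0;
        for (int n : tasks.values()) {
            total += n;
        }
        return total;
    }
}
```

With one task per component, new TopologySketch().setSpout("waveforms", 1).setBolt("detections", 1, "waveforms").setBolt("events", 1, "detections") yields the three-task shape described for the prototype topologies.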
Two types of waveform spouts were created. The first is a standalone Java implementation that generates and emits random waveforms. The second is a Java implementation that consumes waveforms from Kafka and emits them into the topology. This spout works in conjunction with a Java application that generates random waveforms and publishes them to Kafka.
Two signal detection bolts were created that consume waveforms and produce signal detections. The first is a Java implementation that emits random signal detections on the consumed waveforms. The second is a C++ implementation performing the same function.
One event formation bolt was created that consumes signal detections and produces event hypotheses. This bolt has a Java implementation.
Four types of Storm topologies were defined using these spouts and bolts:
Table 1: Topology Definitions
Topology      Waveform spout   Signal Detection bolt   Event Formation bolt
Topology A    Standalone       Java                    Java
Topology B    Kafka            Java                    Java
Topology C    Standalone       C++                     Java
Topology D    Kafka            C++                     Java
4.4.2.1.3.2. Specifying Processing Guarantees
Each of the topologies in Table 1 was defined using two different reliability models. The first group of topologies used the default “at most once” model, which guarantees that all data emitted into a topology is processed no more than one time, but does not guarantee that all data emitted into a topology is processed. The second group of topologies used the “at least once” model, which guarantees that all data emitted into a topology is processed at least one time, but does not guarantee that the data is only processed one time. Neither of these models guarantees any particular data processing order, so Storm may process a tuple emitted into a topology either before or after subsequent tuples emitted into the topology. When the Kafka waveform spout is used the system can additionally guarantee that all of the waveforms published to Kafka through the external program are consumed by the Kafka waveform spout and emitted to the topology, at which point the Storm reliability model dictates whether or not the waveform tuples are guaranteed to get processed by the topology.
Configuring a topology to use the “at least once” reliability model was a trivial change from the “at most once” model that only required tagging each tuple with an identifier as it was emitted to the topology and subsequently acknowledging each time a bolt completes processing for a tuple. Storm uses this information to keep track of the potentially expansive tuple graph stemming from each tuple emitted by a spout and will re-emit a tuple when it determines processing has failed. Storm uses timeouts to recognize unresponsive nodes, workers, or tasks and automatically fails processing for their assigned tuples. Storm users can also explicitly fail tuples within their bolt implementations.
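Storm's documentation describes tracking each tuple graph with a single value formed by XOR-ing tuple identifiers as tuples are emitted and acknowledged. The sketch below is a simplified plain-Java model of that idea, not Storm's implementation: the class AckTracker and its methods are hypothetical, and identifiers are assumed to be distinct and nonzero (Storm uses random 64-bit values).

```java
import java.util.HashMap;
import java.util.Map;

// Simplified model of tuple-graph tracking; not Storm's actual acker code.
final class AckTracker {
    // Root (spout) tuple id -> XOR of every tuple id emitted or acked in its graph.
    private final Map<Long, Long> pending = new HashMap<>();

    // Record a tuple emitted into the graph rooted at rootId.
    void emitted(long rootId, long tupleId) {
        pending.merge(rootId, tupleId, (a, b) -> a ^ b);
    }

    // Record that a bolt finished processing tupleId.
    void acked(long rootId, long tupleId) {
        pending.merge(rootId, tupleId, (a, b) -> a ^ b);
    }

    // Each id is XOR-ed in once on emit and once on ack, so the value returns to
    // zero when every emitted tuple has been acked (assuming distinct, nonzero ids).
    boolean isComplete(long rootId) {
        Long value = pending.get(rootId);
        return value != null && value == 0L;
    }

    // On explicit failure or timeout the graph is dropped so the spout can re-emit the root tuple.
    void fail(long rootId) {
        pending.remove(rootId);
    }
}
```

The appeal of this design is that tracking cost per spout tuple stays constant no matter how many downstream tuples the graph contains.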
Storm can also provide the stronger guarantee of “exactly once” processing, which guarantees each tuple emitted into a topology is processed by the topology a single time, but does not guarantee that tuples are successfully processed in the same order they were emitted. Storm topologies with “exactly once” processing semantics are defined using Storm’s Trident API. These semantics also require a secondary database to store the state Storm uses to track tuple processing. Implementing Trident topologies is therefore significantly different than configuring normal Storm topologies and was not explored in this prototype.
Storm topology definitions allow specifying the number of worker processes running a topology, the initial number of executor threads for a particular spout or bolt, and the total number of tasks allocated for that spout or bolt. Since topology definitions are static these parameters place an upper bound on the cluster resources consumed by an executing topology. Multiple topologies can run simultaneously on the same cluster, but Storm does not provide a means for users to control provisioning specific worker nodes to specific topologies. Storm does provide a command line tool to dynamically change the parallelism of a running topology by setting the number of workers available to the topology and the number of executors assigned to each spout or bolt. Storm provides no built-in means to automatically rebalance topologies, and therefore does not provide dynamic processing elasticity.
4.4.2.1.3.3. Assigning tuples to Processing Tasks
Though Storm does not allow specifying which nodes run which Storm components, it does allow clients to specify how the data processed by each component is distributed across the tasks running that component. Clients assign a textual name to each spout and bolt in the topology definition. These names are used to specify the fundamental data flow (e.g. the waveform spout provides input to the signal detection bolt, which provides input to the event hypothesis bolt). The names are also used to inform Storm of how the topology expects data to be routed across the individual tasks running each component (e.g. group data output from the waveform spout such that waveform data from a station is always processed by the same signal detection bolt task). A client can specify various types of groupings, including that all tuples with the same value for a particular field be routed to the same task, that tuples be distributed randomly but evenly across tasks, that tuples be routed to all tasks, and so on.
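The three groupings named above can be modeled in a few lines of plain Java. The sketch below is illustrative only and does not use Storm's grouping classes; GroupingSketch and its method names are hypothetical, and hashing the field value modulo the task count is an assumed way to obtain a stable field-to-task mapping.

```java
import java.util.concurrent.ThreadLocalRandom;

// Illustrative model of tuple-to-task routing; not Storm's grouping implementation.
final class GroupingSketch {
    // Fields grouping: tuples with the same field value always map to the same task.
    static int fieldsGrouping(String fieldValue, int numTasks) {
        return Math.floorMod(fieldValue.hashCode(), numTasks);
    }

    // Shuffle grouping: tuples are spread randomly across tasks.
    static int shuffleGrouping(int numTasks) {
        return ThreadLocalRandom.current().nextInt(numTasks);
    }

    // All grouping: every task receives a copy of the tuple.
    static int[] allGrouping(int numTasks) {
        int[] targets = new int[numTasks];
        for (int i = 0; i < numTasks; i++) {
            targets[i] = i;
        }
        return targets;
    }
}
```

Routing on a station code with fieldsGrouping keeps all waveform data for one station on one signal detection task, which is the station-affinity example given in the text.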
4.4.2.1.4. Type System
Storm has a weak type system. Each Storm component declares by name (not by type) which fields it will emit, but does not specify which fields it expects as input. This allows Storm components to be wired together in arbitrary ways and allows components to accept different types of tuples as input. Client codes that create topologies are responsible for connecting components in sensible ways.
Storm’s Java APIs provide runtime type checking through the tuple accessor methods.
4.4.2.1.5. Multilanguage Support
Storm topologies are implemented as Apache Thrift structures. Since Thrift has bindings for most common programming languages (including Java, C++, C#, and Python) it is possible to write Storm topologies using a variety of programming languages. Since Storm runs on the Java Virtual Machine and provides a Java API for building topologies, it is straightforward to write Storm code in Java and other JVM languages. In practice, defining topologies in non-JVM languages will likely involve using a topology building API written in that language to provide an abstraction over the Thrift syntax. For example, the Petrel open source project can be used to create Storm topologies in Python.
Storm supports spouts and bolts written in non-JVM languages through the multilang protocol. Storm uses Java wrappers to communicate JSON messages over stdin and stdout with processes written in non-JVM languages. Storm defines the communication protocol and messaging format, which must also be implemented in the non-JVM language. The Java wrappers define the spouts and bolts for configuring in topologies and implement the Java half of the multilang protocol. The C++ signal detection bolt written for the exploratory prototype uses the multilang protocol by leveraging the open source StormCPP implementation. One disadvantage for Unix or Linux based systems is that executable files distributed by Nimbus to cluster nodes lose their executable status. The effect of this is that spouts and bolts compiled as executable programs either require workarounds to be made executable or must be distributed to the cluster nodes independent of Storm.
A second approach to accessing C or C++ codes from Storm topologies is to use spouts or bolts accessing external executable code through the Java Native Interface. This approach circumvents Storm’s multilang protocol by placing messaging responsibility on the implementation. However, it does not require any more Java spout or bolt implementations than the multilang protocol and has the advantages of avoiding JSON serialization and the potential to use local memory when passing parameters between the Java and native codes.
4.4.2.1.6. Serialization and Messaging
Storm uses Kryo to serialize and deserialize tuples to and from messages, but defaults to using the less efficient Java serialization if a Kryo serialization cannot be performed for an object. Client codes can define specific Kryo serializers to use on a class by class basis. Storm’s messaging framework is pluggable with existing implementations in ZeroMQ and Netty. Storm does not guarantee reliable messaging between nodes. However, as discussed previously, Storm’s reliability model can guarantee that all data emitted into the topology is processed, even if individual messages fail.
4.4.2.1.7. Summary
Storm provides a cluster management and distributed processing control mechanism. Topology definitions and resource allocations are statically configured, so all data flows and parallelism are configured before running the topology. Dynamic but manual intervention is required to update a running topology’s resource allocation. Storm natively supports processing components built in a variety of languages, but JVM languages seem easier to use and it is not clear the JSON multilang protocol provides much advantage over JNI when accessing native codes. Storm can enforce a variety of processing guarantees. As a stream processor, it seems most suited to continuously running processing tasks that have little or no critical path reliance on external processing.
Storm has generated interest in the open source community and has over 8,000 GitHub stars. While Storm provides some documentation, much of the support required to set up, configure, and run the prototype cluster and topology came from online user groups and third party tutorials. Storm support may evolve now that it is an Apache managed product.
4.4.2.2. Java EE/Wildfly 8
4.4.2.2.1. Background
4.4.2.2.1.1. Containers & EJBs
Java EE provides a modular component framework for creating robust, distributed applications. At the heart of this framework is the concept of a container, which provides foundational services that otherwise would need to be implemented within the application software (e.g. transaction handling, state management, multi-threading, resource pooling). The container also manages the lifecycle of the contained application component and supports dependency injection. Java EE specifies multiple container types based on its multi-tier architecture, including Application Client Containers, Web Containers, and Enterprise JavaBean (EJB) Containers.
Java EE specifies two types of EJBs that can execute within the EJB container:
1. Session Beans encapsulate processing logic that is invoked from client views (local, remote, web services) as part of a client session. Three types of session beans are defined.
   • Stateful Session Beans interact with a single client and maintain state information across client calls for the duration of a session.
   • Stateless Session Beans can interact with multiple clients, but do not maintain conversational state across client invocations.
   • Singleton Session Beans are instantiated once per application, exist for the lifecycle of the application, and provide concurrent, shared access across multiple clients.
2. Message Driven Beans (MDBs) support asynchronous processing of messages, acting as JMS listeners. MDBs encapsulate logic that is executed as part of a message-driven architecture.
The E1 exploratory prototype focused on an automated processing architecture composed of Message Driven Beans executing within a set of EJB containers.
4.4.2.2.1.2. Dependency Injection
Dependency Injection is a central feature of Java EE enabling flexible configuration of the application and minimal coupling between components. Through dependency injection, the dependencies between components are assigned dynamically at runtime, minimizing the need for explicit static coupling, and allowing for alternative dependencies based on the deployment context of the application (e.g. allowing mock versions of dependent components within test environments). Dependency injection can be specified using either XML configuration or (more commonly) through the use of Java annotations within the source code. The E1 exploratory prototype incorporated dependency injection annotations to specify mock seismic processing component implementations and JMS message topics for communication among components.
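The decoupling that dependency injection provides can be shown without a container. In the plain-Java sketch below, the dependency is passed through a constructor; in a Java EE deployment the container would typically perform this wiring through annotations instead. The types SignalDetector, MockSignalDetector, and DetectionStage are hypothetical and are not taken from the prototype's source code.

```java
// Hypothetical component interface; the prototype's actual types are not shown in the report.
interface SignalDetector {
    int detect(double[] waveform);
}

// Mock implementation of the kind a test deployment might supply.
final class MockSignalDetector implements SignalDetector {
    @Override
    public int detect(double[] waveform) {
        // Pretend every non-empty waveform contains exactly one detection.
        return waveform.length > 0 ? 1 : 0;
    }
}

// The consumer depends only on the interface; the concrete detector is handed in from outside.
final class DetectionStage {
    private final SignalDetector detector;

    DetectionStage(SignalDetector detector) {
        this.detector = detector;
    }

    int process(double[] waveform) {
        return detector.detect(waveform);
    }
}
```

Swapping the mock for a real detector requires no change to DetectionStage, which is the property that allows alternative dependencies per deployment context.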
4.4.2.2.1.3. Wildfly Server Configuration
The Wildfly server can be configured to run in one of two modes: standalone and domain. In standalone mode, each instance of the Wildfly server is configured, deployed and managed independently. In domain mode, a set of Wildfly servers is configured, deployed and managed as a group, known as a managed domain. Although both modes support HA clustering, managed domains simplify clustered server management by maintaining a consistent configuration and coordinating updates across the cluster. As shown in Figure 5, within a domain, the server cluster is managed by a designated Domain Controller, which coordinates the other servers by way of a Host Controller instance running on each host.
Figure 5. Wildfly Clustering Using Managed Domains⁶
Configuration of the managed domain is defined using an XML file: domain.xml. Among other things, this file contains the cluster definition and configuration settings defined as part of an HA profile.
Wildfly provides HA services through two features:
1. Failover – clients interacting with a server instance will not be interrupted, even if the node on which the instance is executing fails. Wildfly supports failover by providing distributed storage of the state information needed to restore processes, and by relocating processing across the cluster upon failure of the underlying processor node.
2. Load Balancing – client requests are distributed across the available nodes of the cluster to maintain timely response in the presence of high request volume.
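The request-routing side of these two features can be sketched in plain Java. The example below is a simplified illustration, not Wildfly's mechanism: ClusterClientSketch is a hypothetical class, round-robin is an assumed balancing policy, and the sketch omits the replicated session state that real failover depends on.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Simplified request router; it models routing only, not state replication.
final class ClusterClientSketch {
    private final List<String> nodes;
    private final Set<String> failed = new HashSet<>();
    private int next = 0;

    ClusterClientSketch(List<String> nodes) {
        this.nodes = new ArrayList<>(nodes);
    }

    // Mark a node as unavailable so later requests are routed elsewhere.
    void markFailed(String node) {
        failed.add(node);
    }

    // Round-robin over the nodes, skipping any node marked as failed.
    String route() {
        for (int i = 0; i < nodes.size(); i++) {
            String candidate = nodes.get(next);
            next = (next + 1) % nodes.size();
            if (!failed.contains(candidate)) {
                return candidate;
            }
        }
        throw new IllegalStateException("no nodes available");
    }
}
```

With two nodes, requests alternate between them until one is marked failed, after which every request goes to the surviving node.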
4.4.2.2.1.4. Messaging
Java EE includes the JMS standard to provide for message-based communication between processing components within a distributed application. JMS provides multiple messaging patterns, including point-to-point messaging via message queues, as well as publish-subscribe messaging via message topics. Although Wildfly supports integration with a number of message oriented middleware technologies, HornetQ is provided as the default solution. The E1 prototype used HornetQ publish/subscribe topics for communication among components.
⁶ Reproduced from the “Wildfly Admin Guide” by Heiko Braun. See the References section for full citation.
HornetQ includes a number of High Availability (HA) facilities to provide at least once or once and only once message delivery guarantees. In order to support at least once delivery guarantees, HornetQ includes shared persistence of message data across nodes within an HA cluster. Two alternate approaches are provided:
• Message Replication – Messages are replicated on multiple nodes within the cluster and are synchronized continuously across the network. Upon failover of the designated live server, the designated backup server retrieves its locally replicated messages in order to resume message processing.
• Shared Store – Messages are persisted in a shared storage area accessible from the nodes of the HA cluster, typically a storage area network (SAN). Upon failover of the designated live server, the designated backup server retrieves messages from the shared storage area in order to resume message processing.
In order to support once and only once delivery guarantees, HornetQ supports transactional processing, where any in-progress transactions on the failed node are rolled back and restarted using persisted messages on the backup node. HornetQ also provides a duplicate message detection capability to prevent repeated message processing during failover for non-transactional deployments.
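Duplicate detection in general can be sketched as a bounded cache of recently seen message identifiers. The plain-Java example below illustrates that general technique under stated assumptions; it is not a description of HornetQ's internals, and the class DuplicateFilter, its capacity parameter, and the string identifiers are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Bounded cache of recently seen message ids; an illustration, not HornetQ code.
final class DuplicateFilter {
    private final Map<String, Boolean> seen;

    DuplicateFilter(final int capacity) {
        // Insertion-ordered map that evicts its oldest entry once capacity is exceeded.
        this.seen = new LinkedHashMap<String, Boolean>() {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Boolean> eldest) {
                return size() > capacity;
            }
        };
    }

    // Returns true the first time an id is seen, false for a redelivered duplicate.
    boolean accept(String messageId) {
        return seen.put(messageId, Boolean.TRUE) == null;
    }
}
```

The bound keeps memory use fixed, at the cost that a duplicate arriving after its identifier has been evicted is no longer recognized.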
4.4.2.2.2. Exploratory Prototype
For the E1 exploratory prototype, an HA cluster was defined across 2 Virtual Machine (VM) hosts using a managed domain. Each VM was configured to host an instance of the Wildfly Server running a mock seismic processing pipeline. The mock seismic pipeline was defined as a set of Message Driven Beans communicating mock waveform and signal detection data via a set of HornetQ publish/subscribe JMS topics. The purpose of this prototype was to further investigate the primary features of a processing architecture built using EJB, CDI and JMS within a Wildfly application server cluster. Findings based on the prototype are discussed in the sections below.
4.4.2.2.2.1. Wildfly Server Management
Definition and deployment of the Wildfly Managed Domain HA cluster was relatively straightforward given the solid documentation and quick start examples available from the JBoss community. Wildfly provides a sophisticated Command Line Interface (CLI) for administration of the runtime environment; however, for the E1 prototype, only a small set of basic commands was needed to deploy and manage the application. A JBoss Eclipse plugin is available, which supports execution of the Wildfly server and deployment of the Java EE application via Maven from within the Eclipse IDE. Wildfly also provides a secure, web-based administration interface supporting configuration and monitoring.
4.4.2.2.2.2. Mock Seismic Pipeline
The mock seismic pipeline was defined as a collection of processing components encapsulated within Message Driven Beans. The components themselves were implemented as Plain Old Java Objects (POJOs), which were injected into their respective MDBs at runtime using CDI. Depending on their role within the pipeline, components consumed mock waveform and/or signal detection data, and produced either mock waveform data, signal detections, and/or event hypotheses. The mock inputs and outputs were transmitted between processing components asynchronously using JMS topics injected into the MDBs at runtime using CDI. For convenience, a servlet was defined to inject mock waveform data into the front end of the pipeline by way of a client web page. Figure 6 depicts the mock pipeline processing components, data provider servlet, and JMS message paths.
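The shape of such a topic-connected pipeline can be sketched in plain Java. The example below is an in-memory stand-in that does not use the JMS or EJB APIs, and it delivers messages synchronously, whereas the prototype's JMS topics deliver asynchronously. TopicBus, MockPipeline, and the topic names "waveforms", "detections", and "events" are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// In-memory stand-in for publish/subscribe topics; not the JMS API.
final class TopicBus {
    private final Map<String, List<Consumer<String>>> subscribers = new HashMap<>();

    void subscribe(String topic, Consumer<String> listener) {
        subscribers.computeIfAbsent(topic, t -> new ArrayList<>()).add(listener);
    }

    // Deliver the message to every listener subscribed to the topic (synchronously here).
    void publish(String topic, String message) {
        for (Consumer<String> listener : subscribers.getOrDefault(topic, new ArrayList<>())) {
            listener.accept(message);
        }
    }
}

final class MockPipeline {
    // Wires waveform -> detection -> event stages and returns the list that collects events.
    static List<String> wire(TopicBus bus) {
        List<String> events = new ArrayList<>();
        // Each stage plays the role of a message-driven component listening on one topic.
        bus.subscribe("waveforms", w -> bus.publish("detections", "detection(" + w + ")"));
        bus.subscribe("detections", d -> bus.publish("events", "event(" + d + ")"));
        bus.subscribe("events", events::add);
        return events;
    }
}
```

Publishing a mock waveform to the "waveforms" topic plays the role of the data provider servlet: the message flows through both stages and arrives in the returned list as an event hypothesis.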
Although it was not implemented as part of the E1 pipeline, the prototype was designed in such a way that the processing components could be instantiated separately within Session Beans as part of a mock interactive analysis interface.
Development of the MDBs, JMS interfaces and processing components was straightforward and benefitted significantly from a wealth of solid documentation and quick-start example code available from the JBoss community.
Figure 6. E1 Prototype Mock Seismic Pipeline
4.4.2.2.2.3. Limitations
A significant limitation of Java EE that was discovered as part of the evaluation is the lack of support for execution of native software within Java EE applications. The EJB standard prohibits loading of native libraries (e.g. via Java Native Interface or Java Native Access) from within the EJB container. Similarly, the standard prohibits the explicit execution of threads within EJB-managed applications. These restrictions in the standard are intended to preserve the security, stability and portability of Java EE applications; however, they eliminate the ability of the application to directly invoke processing components implemented in non-JVM languages such as C and C++.
One option for overcoming this limitation is to use web services to communicate between the Java EE applications and non-JVM components of the system. This solution was not included as part of the E1 prototype and should be addressed in follow-on work in order to assess the viability of a Java EE-based architecture for mixed-language applications, pending a decision regarding the scale and usage pattern of non-JVM software within the modernized system architecture.
4.4.2.2.2.4. Conclusions
An architecture built from the Java EE technologies used in the E1 prototype provides a flexible, robust, fault-tolerant distributed processing capability that is well suited to development of a Java-based automated processing pipeline.
Harnessing additional Java EE technologies such as Session Beans will likely provide similarly powerful capabilities for the development of interactive analysis processing components.
The Wildfly server provides a stable, secure and user friendly runtime environment enabling deployment and management of highly-available Java EE processing applications.
Together the Java EE APIs and Wildfly server administration tools, documentation and examples enable highly efficient application development.
A significant limitation of the Java EE architecture is the lack of support for execution of non-JVM software components. Solutions for this use case should be evaluated in follow-on work in order to understand the viability of Java EE technology in the modernized system architecture.
4.5. Follow-On Work
Four areas of follow-on work have been identified for iteration E2 and beyond based on the E1 prototyping activities:
1. Explore additional PCF solutions, based on input from the community
2. Assess custom PCF solutions
3. Develop a basic processing pipeline prototype
4. Select a PCF solution for the executable architecture prototype
4.5.1. Explore additional PCF solutions
In iteration E2, the prototyping team will solicit and incorporate candidate solutions from the community. The intent is to leverage design and prototyping knowledge available from applicable development organizations and to ensure that informed decisions are made in the development of a processing control framework considering a breadth of candidate solutions.
Emphasis will be placed on existing design patterns and associated solutions that provide flexible, multi-language processing support with minimal coupling between components.
4.5.2. Assess custom PCF solutions
In iteration E2, the prototyping team will assess custom solutions meeting the requirements of the processing control framework. The purpose of this effort is to determine whether a solution incorporating more substantial custom software will better meet the needs of the system than the COTS-based solutions explored in iteration E1.
4.5.3. Select a PCF solution for the executable architecture prototype
Based on the E1 and E2 exploratory prototyping, the team will select a solution for use in developing the PCF element of the executable architecture prototype. This selection will be completed by the end of the E2 iteration, in order to allocate sufficient time for iterative development of the executable architecture prototype in iterations E3-E4.
4.5.4. Develop a basic processing pipeline prototype
Beginning in E2, the prototyping team will assemble a representative set of seismic processing components based on software baselines drawn from GNEM-related research projects and possibly from existing US NDC and IDC applications. These components will provide more representative processing software for use in further evaluation of candidate PCF solutions, performance benchmarking, and development of the executable architecture prototype.
APPENDIX A. COMPARISON OF PROTOTYPE AND EXISTING SYSTEM PROCESSING CONTROL FRAMEWORKS
Table 2: Feature sets of E1 Prototype Candidates and Current Systems
| Feature | E1 Prototype Candidate: Apache Storm | E1 Prototype Candidate: Java EE/Wildfly | Existing System: US NDC (Current) | Existing System: IDC (Current) | Existing System: NEIC (Current) | Existing System: SeisComp3 |
|---|---|---|---|---|---|---|
| Processing Deployment & Execution | Apache Storm / Apache Zookeeper | Wildfly | Custom | Custom | Custom | Custom |
| Processing Definition | Apache Storm, Apache Thrift | Wildfly | Custom | Custom | Custom | Custom |
| HA Clustering | Apache Storm / Apache Zookeeper | Wildfly mod_cluster | N/A | N/A | N/A | N/A |
| Transaction Management | Trident | Wildfly JTA | N/A | N/A | N/A | N/A |
| Messaging | Multiple: ZeroMQ, Netty, Kafka | Multiple JMS providers (default: HornetQ JMS) | Oracle Advanced Queuing | Tuxedo | Custom | Spread |
Table 3: E1 Survey Results Summary
| Category | Candidate Solution | Summary Assessment |
|---|---|---|
| Enterprise Java Application Frameworks | Java EE | Advantages: Widely used open standards with large development community. Provides a robust platform for development of scalable, fault-tolerant, distributed processing architectures. Disadvantages: EJB standard prohibits use of native libraries and direct thread creation, limiting design options supporting non-JVM languages. |
| Enterprise Java Application Frameworks | Spring Framework | Advantages: Widely used open-source solution with large development community. Provides a robust platform for development of scalable, fault-tolerant, distributed processing architectures. Disadvantages: Not standards-based. |
| Stream Processors | Apache Storm | Advantages: Open-source solution with significant industry interest. Provides a robust platform for development of scalable, fault-tolerant, distributed processing architectures. Supports multiple development languages. Disadvantages: New offering. Not standards-based. |
| Stream Processors | Apache Samza | Advantages: Provides a robust platform for development of scalable, fault-tolerant, distributed processing architectures. Disadvantages: New offering that has yet to establish significant industry interest. Not standards-based. Does not support multiple languages (Java only). |
| Stream Processors | Apache S4 | Advantages: Provides a robust platform for development of scalable, fault-tolerant, distributed processing architectures. Supports multiple development languages. Disadvantages: Little industry interest and development activity. Not standards-based. |
| Enterprise Service Bus | WSO2 ESB | Advantages: Provides a robust platform for integration of heterogeneous systems via standardized messaging as part of a service-oriented architecture. Disadvantages: Design strengths not well aligned to the end-state modernized architecture (US NDC and IDC are not a heterogeneous system of systems). |
| Complex Event Processor | Esper | Advantages: Provides a robust platform for development of scalable, fault-tolerant, distributed processing architectures. Disadvantages: Specialized, query-based architecture does not fit system processing needs particularly well. Not standards-based. Does not support multiple languages (Java only). |
REFERENCES
1. Storm: Distributed and Fault Tolerant Realtime Computation, The Apache Software Foundation, (http://storm.incubator.apache.org/).
2. S4: Distributed Stream Computing Platform, The Apache Software Foundation, (http://incubator.apache.org/s4/).
3. Samza, The Apache Software Foundation, (http://samza.incubator.apache.org/).
4. Esper, EsperTech Inc., (http://www.espertech.com/products/esper.php).
5. WSO2 Enterprise Service Bus, WSO2, (http://wso2.com/products/enterprise-service-bus/).
6. Mule ESB, MuleSoft Inc., (http://www.mulesoft.org).
7. Kafka: A High-Throughput Distributed Messaging System, The Apache Software Foundation, (http://kafka.apache.org/).
8. Apache Hadoop NextGen MapReduce (YARN), The Apache Software Foundation, (http://hadoop.apache.org/docs/r2.3.0/hadoop-yarn/hadoop-yarn-site/YARN.html).
9. Zookeeper, The Apache Software Foundation, (http://zookeeper.apache.org/).
10. ZeroMQ, iMatix Corporation, (http://zeromq.org/).
11. Netty, The Netty Project, (http://netty.io/).
12. Apache Thrift, The Apache Software Foundation, (http://thrift.apache.org/).
13. Apache Helix, The Apache Software Foundation, (http://helix.apache.org/).
14. Oracle Glassfish, Oracle Corporation, (http://www.oracle.com/us/products/middleware/cloud-app-foundation/glassfish-server/).
15. Apache Tomcat, The Apache Software Foundation, (http://tomcat.apache.org/).
16. Jetty, The Eclipse Foundation, (http://www.eclipse.org/jetty/).
17. WebLogic, Oracle Corporation, (http://www.oracle.com/WebLogic).
18. WebSphere, IBM, (www.ibm.com/software/websphere/).
19. Petrel, (https://github.com/AirSage/Petrel).
20. C++ Wrapper for Storm, 2012, (http://demeter.inf.ed.ac.uk/cross/stormcpp.html).
21. Storm-Kafka, (https://github.com/nathanmarz/storm-contrib).
22. Noll, Michael, Running a Multi-Node Storm Cluster, 2013, (http://www.michael-noll.com/tutorials/running-multi-node-storm-cluster/).
23. Hamlet, Benjamin R., et al., US NDC Modernization: Service Oriented Architecture Proof of Concept, Sandia National Laboratories, Albuquerque, NM 87185, December 2014.
24. Jendrock, Eric, Cervera-Navarro, Ricardo, Evans, Ian, Haase, Kim, Markito, William, Srivathsa, Chinmayee. “Java EE 7 Tutorial.” Oracle Java EE Documentation. Oracle, September 2013. Web. http://docs.oracle.com/javaee/7/tutorial/doc/home.htm.
25. Wheeler, Willie & White, Joshua, Spring in Practice, Manning Publications, 2013. Print.
26. Braun, Heiko. “Wildfly Admin Guide.” Wildfly 8 Documentation. JBoss, Jan 22, 2014. Web. https://docs.jboss.org/author/display/WFLY8/Admin+Guide
27. Braun, Heiko. “Developer Guide.” Wildfly 8 Documentation. JBoss, Jan 22, 2014. Web. https://docs.jboss.org/author/display/WFLY8/Developer+Guide
28. Maple, Simon, Shelajev, Oleg, Muuga, Sigmar, White, Oliver. “The Great Java Application Server Debate with Tomcat, JBoss, GlassFish, Jetty & Liberty Profile.” RebelLabs. RebelLabs, May 21, 2013. Web. http://zeroturnaround.com/rebellabs/the-great-java-application-server-debate-with-tomcat-jboss-glassfish-jetty-and-liberty-profile/
29. White, Oliver. “Developer Productivity Report 2012: Java Tools, Tech, Devs & Data.” RebelLabs. RebelLabs, May 15, 2012. Web. http://zeroturnaround.com/rebellabs/developer-productivity-report-2012-java-tools-tech-devs-and-data/
30. O’Grady, Stephen. “New Relic and the State of the Stacks.” RedMonk. Redmonk, June 13, 2012. Web. http://redmonk.com/sogrady/2012/06/13/new-relic-stack-data/
31. Noll, Michael. Understanding the Parallelism of a Storm Topology, 2012, (http://www.michael-noll.com/blog/2012/10/16/understanding-the-parallelism-of-a-storm-topology/).