Field Support Toolbox - Debug Procedures
Nick Hurd, Technical Director
CMSgateways.com

CONNECT / DIRECT Field Support Overview
- CONNECT / DIRECT is a vital component necessary for Electronic Health Information Exchange
  - Documented success of CONNECT/DIRECT systems
    - Many installations
    - Fulfills various requirements
  - Requirements vary depending on participants
    - Example: DoD (HW security) vs. other participants (SW security)
- Continuous operation will require field service support
  - Requires communications between different vendors, modules & versions
  - Many interdependent stages (hops)
  - Troubleshooting dependencies, updates, interoperability
  - System problem resolution can require hours/days/weeks
- Reliable operations will require efficient field support
  - Processes, tools, personnel, training, documentation
- Field service tools expedite CONNECT / DIRECT acceptance

CONNECT/DIRECT Case Study: CMS Electronic Report Workflow
[Diagram labels: Health Care Provider, CMS, Quality Report (PHI), Feedback]

CMS Electronic Report Requirements
- CMS reporting requirements: Validity, Integrity, Precision, Reliability, Timeliness, Access, Security
[Diagram labels: Health Care Provider, CMS, Quality Report (PHI), Feedback]

Unique modules from different vendors implement and verify each requirement
[Diagram labels: Health Care Provider, CMS, CONNECT / DIRECT, Quality Report (PHI), Feedback; modules: Data Source, Access, Security, Integrity, Validity, Precision, Reliability, Timeliness]

Data logjam - One problem can stop workflow
[Diagram labels: same modules as the previous slide; caption: "Where's my report?"]

CONNECT/DIRECT Field Support Overview
- Current Problem Determination (PD) process characteristics
  - Labor-intensive diagnosis
  - Manually assemble, correlate, and interpret logs
  - Repetitive, time-consuming problem resolution tasks
  - Advanced skills and extensive debug time (hours/days) required
- System design has impact on PD
  - Are PD diagnostics integrated into code paths?
  - CONNECT 4.x has begun integration of PD logs & metrics!
- Poor problem determination processes & lack of PD tools lead to
  - Increased cost of ownership
  - Decreased utilization
  - Decreased market share
  - Disconnected & mothballed technology

CONNECT/DIRECT Field Support Overview
- Field Support Goal: Improve maintainability
  - Automated diagnostic tools
  - Reduced downtime
  - Streamlined diagnostic processes
  - Reduce cost of support
- Components of maintenance:
  - Reliability
    - Optimize MTBF (Mean Time Between Failure)
  - Availability
    - Total time a system is expected to function
    - Mean Time Before Repair (MTBR)
  - Serviceability
    - Ease of maintenance & repair
    - Minimize MTTR (Mean Time To Recovery/Repair)
- RAS: Reliability, Availability, Serviceability
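
The MTBF and MTTR terms on this slide are commonly tied together by the steady-state formula availability = MTBF / (MTBF + MTTR). The sketch below only illustrates that arithmetic. The class name `RasMetrics`, its method names, and the sample figures (500 h between failures, 8 h vs. 2 h to repair) are assumptions made for illustration and do not come from the presentation.

```java
// Hypothetical helper: names and sample figures are illustrative assumptions.
final class RasMetrics {
    private RasMetrics() {}

    /** Steady-state availability = MTBF / (MTBF + MTTR), both in the same time unit. */
    static double availability(double mtbfHours, double mttrHours) {
        if (mtbfHours <= 0 || mttrHours < 0) {
            throw new IllegalArgumentException("MTBF must be > 0 and MTTR must be >= 0");
        }
        return mtbfHours / (mtbfHours + mttrHours);
    }

    /** Expected downtime over a period, given an availability between 0 and 1. */
    static double expectedDowntimeHours(double availability, double periodHours) {
        return (1.0 - availability) * periodHours;
    }

    public static void main(String[] args) {
        // Assumed figures: same failure rate, two different repair times.
        double slowRepair = availability(500, 8);
        double fastRepair = availability(500, 2);
        System.out.printf("MTTR 8 h: availability %.4f, about %.1f h down per year%n",
                slowRepair, expectedDowntimeHours(slowRepair, 8760));
        System.out.printf("MTTR 2 h: availability %.4f, about %.1f h down per year%n",
                fastRepair, expectedDowntimeHours(fastRepair, 8760));
    }
}
```

With MTBF held constant, shortening MTTR is the only lever left, which is why the slide places "Minimize MTTR" under Serviceability.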

Different modules implement and verify each requirement
[Diagram labels: Health Care Provider, CMS, CONNECT, Quality Report (PHI), Feedback; modules: Data Source, Access, Security, Integrity, Validity, Precision, Reliability, Timeliness]

Problem scenario #1: Data logjam - One problem stops workflow
[Diagram labels: same modules; caption: "Where's my report?"]

Current debug process - step #1: Manual review of all logs
[Diagram labels: same modules, with LOG1, LOG2, LOG3 ... LOGn attached]

Current debug process - step #2: Detailed review of log of offending module
[Diagram labels: same modules; LOG2: "No valid Access list"; Certification List corrupted]

Problem scenario #2: Interactive problems -> Increased MTTR
[Diagram labels: same modules plus Source verification and Datacomm; LOG2: "No valid Access list"; LOG7: "No access list"]

Problem scenario #3: Entire system deadlocked
[Diagram labels: same modules; NO Feedback]

Current debug process - step #1: Manual review of all logs => unusable
[Diagram labels: same modules, with LOG1, LOG2, LOG3 ... LOGn]

Diagnosis: EXPIRED log account -> Halted log file creation
[Diagram labels: same modules; EXPIRED LOG ACCOUNT; LOG1, LOG2, LOG3 ... LOGn]

CONNECT/DIRECT Field Support Overview
- Problem Determination (PD) components
  - Problem management discipline
    - Automate maintenance functions
    - Identify RAS tools requirements (Reliability, Availability, Serviceability)
  - PD workflow procedures
    - PD query process
    - PD environments
  - RAS tool solutions
    - Open source vs. proprietary
    - Diagnostic information from variety of sources

CONNECT/DIRECT Field Support Objective
- Problem Management Discipline
  - Problem documentation: confirm, categorize, prioritize & publish
  - Acquire relevant Problem Determination (PD) data
    - Automate common PD support tasks
    - Involve all participants: users, field support staff, 3rd parties
    - Example: cross-reference problem lists from other bugs & third-party modules
  - Apply tools => observe & control system
    - Expedite the identification of fault source(s)
  - PD data analysis (dev team, test team or field support)
    - Transform intermittent bug => regular bug
    - Resolve the mystery cause(s)
  - Implement bug fix (w/ no side effects)

CONNECT/DIRECT Field Support Workflow
- Diagnostic workflow procedures
  - Goal: Acquire relevant diagnostic data
- Understand operations
  - Cartography - functional map of complete system
    - Internals: modules & data flow
    - Externals: protocols & states of transaction
  - Configuration, version control
    - Standardized update procedures
    - Module interdependencies
- Tools and diagnostic data acquisition processes
  - Extend development & test bench into field
  - Enable users & field personnel to collect USEFUL diagnostics

CONNECT/DIRECT Field Support Tools
- Problem Determination (PD) automation tools
  - Automated data collection
    - Configuration, input/output, status, version
    - Heterogeneous environment modules & subsystems
    - Diagnostic APIs: logs, traces, events, signals, exceptions
  - Forensic data mining
    - Log merge, parsing, sorting & analysis
    - Identify events leading up to problem
    - Isolate source(s) of problems
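
The "log merge, parsing, sorting" step above can be sketched as a small timeline builder. This is a minimal illustration, not the toolbox's actual implementation. It assumes every log line starts with an ISO-8601 local timestamp followed by a space (real module logs will need per-vendor parsers), and the class name `LogMerge`, the log names and the sample messages are hypothetical.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeParseException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: class names, log names and the line format are assumptions.
final class LogMerge {
    /** One log entry tagged with the log it came from. */
    static final class LogEvent {
        final LocalDateTime time;
        final String source;
        final String message;
        LogEvent(LocalDateTime time, String source, String message) {
            this.time = time;
            this.source = source;
            this.message = message;
        }
        @Override public String toString() {
            return time + " [" + source + "] " + message;
        }
    }

    /** Returns the leading ISO-8601 timestamp of a line, or null if there is none. */
    private static LocalDateTime leadingTimestamp(String line) {
        int space = line.indexOf(' ');
        if (space <= 0) return null;
        try {
            return LocalDateTime.parse(line.substring(0, space));
        } catch (DateTimeParseException e) {
            return null;
        }
    }

    /**
     * Parses one log. Lines without a timestamp (for example stack-trace lines)
     * are appended to the previous entry; such lines before the first entry are dropped.
     */
    static List<LogEvent> parse(String source, List<String> lines) {
        List<LogEvent> events = new ArrayList<>();
        for (String line : lines) {
            LocalDateTime t = leadingTimestamp(line);
            if (t != null) {
                events.add(new LogEvent(t, source, line.substring(line.indexOf(' ') + 1)));
            } else if (!events.isEmpty()) {
                LogEvent prev = events.remove(events.size() - 1);
                events.add(new LogEvent(prev.time, source, prev.message + "\n" + line));
            }
        }
        return events;
    }

    /** Merges all logs into one timeline ordered by timestamp. */
    static List<LogEvent> merge(Map<String, List<String>> logsBySource) {
        List<LogEvent> all = new ArrayList<>();
        for (Map.Entry<String, List<String>> log : logsBySource.entrySet()) {
            all.addAll(parse(log.getKey(), log.getValue()));
        }
        // List.sort is stable, so entries with equal timestamps keep their input order.
        all.sort(Comparator.comparing((LogEvent e) -> e.time));
        return all;
    }

    public static void main(String[] args) {
        Map<String, List<String>> logs = new LinkedHashMap<>();
        logs.put("LOG2", Arrays.asList("2014-03-01T10:00:05 No valid access list"));
        logs.put("LOG7", Arrays.asList("2014-03-01T10:00:01 Certification list corrupted"));
        for (LogEvent e : merge(logs)) {
            System.out.println(e);
        }
    }
}
```

Ordering by timestamp only works if the clocks of the contributing systems are synchronized; in a multi-vendor field install that is itself something to verify before trusting the merged timeline.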

CONNECT/DIRECT Problem Determination (PD) Components
[Diagram labels: System 1, System 2, System 3; layers: OS, Drivers/DLL, JVM, App Svr, CONNECT, App, Thread(s), DBMS; diagnostic info sources: Signals, Logs, Traces, Exceptions, Assert; APIs: Filters, Formatters, Modifiers; output options: Net Socket, Mem Buff, OutputStream, Console, File; View & Analysis Tools]

PD considerations: CMS Quality Report workflow paths
[Diagram labels: Providers: IE_EHR (200+ vendors), PM, DBMS (legacy, cloud), CONNECT & other subsystems; HIT matrix: HIE, HISP, HIH, XCA, XCPD, PHR, Security/Access; CCD / PQRS; CMS: Vetting, XML Parser, File Management, DBMS]

Problem Determination (PD) Queries
- Problem determination workflow procedures
- PD queries
  - Accurate problem report?
  - Different system?
  - Different state?
  - Different data?
- Complete problem report via PD queries
  - User interview
  - Diagnostic data acquisition
  - PD procedures

Problem Determination (PD) Query #1
- Is this problem report / observation accurate?
  - Corrupted problem record
    - Incomplete, unreliable communications
  - Misattribution / false correlation
    - Intermittent problem misconstrued => non-intermittent problem w/ complex and unlikely set of causes (MS Word => Windows crash)
  - Misrepresentation
    - Incomplete assessment (PS3 malfunction: hidden connector was unplugged)
  - Different operators have different problem tolerances and sensitivities
    - Sensitivities vary with time of day
  - Irrelevant problem (i.e. observation is too accurate)

Problem Determination (PD) Query #1
- PD information categories - problem reports
  - Timestamp, PD environment, priority, classification, scope of problem
  - Log augmentation: track multiple entries by multiple authors

Problem Determination (PD) Query #2
- Is it a different system?
  - Automatic or IT updates
  - Trespassing system - foreign intrusions
  - Configuration changes
  - Third-party add-ons affect code paths
    - Drivers, driver stacks, DLLs, apps, monitors
- Documentation & processes in place
  - Automated version comparison / control programs
  - Rollbacks & version control coordination
  - Third parties
    - Documented version interdependencies

Problem Determination (PD) Query #3
- Is system in a different state?
  - System in different mode?
    - User or protocol may have set different mode
  - Improper init
    - Changes in config, registries, resources & routing tables
  - Resource denial
    - File, stream, or other resource
    - Corrupted, does not exist, locked by another process/thread
  - Occasional functions
    - Auto-save, periodic maintenance, internal garbage collection
  - Progressive data corruption (timing loops, rounding)
  - Progressive destabilization
    - Destabilizing event: create wild pointer
    - Initiating event: use wild pointer

Problem Determination (PD) Query #4
- Did system receive different data?
  - Secret / different boundaries and conditions
    - Software may act differently in different parts of input space
    - Different logic invoked by chosen option(s)
  - Input corruption
    - Input corrupted or intercepted
  - Deus ex machina - third-party influence
    - Fellow developer/tester, other user, hacker
  - Accidental or ghost input
    - Signals from different peripherals, networks
    - Sun => optical mouse
  - RTF from MS Word & MS WordPad are not the same
  - Consider time & loading as an input

Problem Determination (PD) Processes
- PD environments
  - Development, System Test, Multi-System Test, Field Install
- PD tools
  - Scope of diagnostic data
    - Systemwide, server, application, module
    - Component interactions
  - Tool providers: open source & proprietary
- Set up communications between all of the above!

Problem Determination (PD) environment #1: Software Development
- Software development environment
  - Interactive debugging - IDE / Eclipse (or ?)
    - Call stack, variable values, breakpoints
  - Printf debugging / TRON
  - ASSERT
  - Post-mortem debug: crash analysis
  - Semantic errors - static code analysis tools

Problem Determination (PD) environment #2: System test suite
- System test suite environment
  - Purpose: decrease costs of functional defects
    - Each development stage has associated defect resolution costs
      - Requirements, architecture, construction, system test, post release
    - Defect costs more if caught at later stage
    - Field support => multiple updates => configuration changes
    - Cloud/continuous deployment reduces costs of later stages
  - Test input combinations and preconditions
    - Automated finite combinational tests
    - Get greater test coverage with fewer tests
    - Compromise test speed vs. test depth
  - Need coverage of non-functional attributes
    - Usability, scalability, performance, compatibility (version), reliability

PD environment #3: Inter-system bench test
- Controlled environment
  - Version, loading, data mix
  - Multi-vendor, multi-module
  - Multiple overlapping errors increase PD complexity
- Controlled debugging
  - Dedicated offline systems => remote test bed
- Problem determination
  - Balance performance with serviceability (RAS)
  - Automated data collection
  - Test offline analysis procedures - automated & manual

PD environment #4: Field Install
- Customer install - field service
- Uncontrolled environment
  - Version, loading, data mix
  - Multi-vendor, multi-module
  - Multiple overlapping errors increase PD complexity
- Online, live debugging
  - #1 goal of field support: keep system online!
  - Can dedicate extra system as remote test bed
- Problem determination
  - Balance performance with serviceability (RAS)
  - Automated data collection
  - Offline analysis - automated & manual

PD debug mode #1 => Source debug
- Logic debug of an app module
- Hard faults - ASSERT
  - Usually removed from production code
- Intermittent problems
  - Stress system to recreate problem
  - If race condition exists, usually affected by debug process
    - Threading, memory management issues
    - Debugger affects timing; can exaggerate or mask the problem
  - Fuzz tests w/ random input => irrational border cases

PD debug mode #2: API debug
- Problems between system components
- Heterogeneous environment
  - Must track version history of (related) subsystems
  - Interdependencies
  - Scripted automated compare: look for version delta
- Automated test scripts
  - Version dependencies
    - Example: NwHIN protocols
  - Options
  - Race conditions
    - Test configurations => vary timing
  - System loading
    - Test configurations => vary sources, sinks & data loads

Inter-System Datacomm PD
[Diagram labels: Provider: IE_EHR, PM, DBMS, CONNECT & other subsystems; HIEs; CONNECT; CMS: Vetting SW, Parser SW, File Management, DBMS, Claims; flows: Quality Report, FEEDBACK]

PD debug mode #3 - Communications
- Communication protocols between systems
- PD transaction analysis
  - Between CONNECT and trading partners such as:
    - NIST: conformance testing against a reference
    - Other vendors: interoperability (@ IHE Connectathon)
- CONNECT V4.0 incorporates PD metric & error logs
  - Performance: transaction type, payload
  - Error messages log
- XDS.b transaction/datacomm tools & reference [email protected]
  - Test Tools -> http://hit-testing.nist.gov:12080/xdstools2
  - Connectathon: http://www.ihe.net/connectathon/

PD debug mode #4: Security
- Security management problems
  - CERT management is a time-consuming debug issue!
    - Default certificate configuration
    - Obtaining signer certificate from a remote port
    - Remote signer certificate retrieval
    - Validating a remotely-retrieved signer certificate
    - Replacing certificates and signers
    - Certificate expiration monitor and dynamic run time updates
    - Advanced certificate and key management issues
- CERT management tools
  - WebSphere GUI admin console
  - Windows command line => certmgr.exe
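
A "certificate expiration monitor" of the kind listed above could start as a periodic scan of a keystore. The following is a rough sketch using the standard `java.security.KeyStore` API; it is not how WebSphere or CONNECT implement their monitors. The class name `CertExpiryCheck`, the 30-day warning window, and the command-line arguments are assumptions for illustration.

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.KeyStore;
import java.security.KeyStoreException;
import java.security.cert.Certificate;
import java.security.cert.X509Certificate;
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch: class name, warning window and CLI arguments are assumptions.
final class CertExpiryCheck {
    /** True if validity ends on or before now + window (already-expired certificates included). */
    static boolean expiresWithin(Instant notAfter, Instant now, Duration window) {
        return !notAfter.isAfter(now.plus(window));
    }

    /** Returns the aliases whose X.509 certificate expires within the window. */
    static List<String> expiringAliases(KeyStore keyStore, Instant now, Duration window) {
        List<String> hits = new ArrayList<>();
        try {
            for (String alias : Collections.list(keyStore.aliases())) {
                Certificate cert = keyStore.getCertificate(alias);
                if (cert instanceof X509Certificate) {
                    Instant notAfter = ((X509Certificate) cert).getNotAfter().toInstant();
                    if (expiresWithin(notAfter, now, window)) {
                        hits.add(alias);
                    }
                }
            }
        } catch (KeyStoreException e) {
            // Thrown when the keystore has not been loaded.
            throw new IllegalStateException("keystore not initialized", e);
        }
        return hits;
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.out.println("usage: CertExpiryCheck <keystore-file> <password>");
            return;
        }
        KeyStore keyStore = KeyStore.getInstance(KeyStore.getDefaultType());
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            keyStore.load(in, args[1].toCharArray());
        }
        for (String alias : expiringAliases(keyStore, Instant.now(), Duration.ofDays(30))) {
            System.out.println("expires within 30 days: " + alias);
        }
    }
}
```

Running such a check from a scheduler and alerting on any hit turns an expired certificate from a multi-hour debugging session into a routine maintenance task.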

PD debug mode #5: Intermittent bug
- Field multi-system intermittent problems
- Field support procedures & tools requirements
  - Support multi-vendor environments
    - Version dependencies of multiple modules
    - Disparate data sources
  - Automated data collection
    - Minimize expertise required for data acquisition
    - Automate module / code path analysis
  - Offline analysis merges diag data from different sources
- Minimize and localize performance tradeoffs
  - Serviceability (RAS) AND
  - System loading, throughput, stability

PD Doc #1: Automated Version Documentation!
[Diagram labels: System 1, System 2, System 3; layers: OS, Drivers/DLL, JVM, App Svr, CONNECT, App, DBMS; Version documentation SCRIPT; FC script; VERSIONS: CONNECT VERSION, System VERSION, Composite VERSION, Composite VERSION (Yesterday); Version Compare Tool(s)]

PD Doc #1 - System config docs
- System DOCUMENTATION
  - Timely automated gathering of CONFIG
    - Modules / subsystems / OS
    - ALL VENDORS!
    - Date, time, checksums
  - Automated, scripted comparison
    - Establish version / change history
    - Immediately spot any deltas
    - Helps to map out updates, rollbacks, hotfixes, etc.
- Some people rely on dump/trace/log for same info
  - Deltas are not easy to extract and compare
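
The scripted comparison described above amounts to diffing two snapshots of component versions or checksums. Here is a minimal sketch of that idea, assuming each snapshot has already been collected into a map from component name to a version or checksum string. The class name `VersionDelta`, the component names and the version strings are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical sketch: component names and version strings are made up for illustration.
final class VersionDelta {
    /** SHA-256 of a file's bytes as lowercase hex, usable as the "checksum" column. */
    static String sha256Hex(byte[] data) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256").digest(data);
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 not available", e);
        }
    }

    /** Lists components that were added, removed or changed between two snapshots. */
    static List<String> diff(Map<String, String> before, Map<String, String> after) {
        List<String> deltas = new ArrayList<>();
        Set<String> names = new TreeSet<>(before.keySet());
        names.addAll(after.keySet());
        for (String name : names) {   // sorted, so the report order is stable
            String oldValue = before.get(name);
            String newValue = after.get(name);
            if (oldValue == null) {
                deltas.add("ADDED   " + name + " " + newValue);
            } else if (newValue == null) {
                deltas.add("REMOVED " + name + " " + oldValue);
            } else if (!oldValue.equals(newValue)) {
                deltas.add("CHANGED " + name + " " + oldValue + " -> " + newValue);
            }
        }
        return deltas;
    }

    public static void main(String[] args) {
        Map<String, String> yesterday = new TreeMap<>();
        yesterday.put("JVM", "1.7.0_45");
        yesterday.put("CONNECT", "4.2");
        yesterday.put("gateway.properties",
                sha256Hex("timeout=30".getBytes(StandardCharsets.UTF_8)));

        Map<String, String> today = new TreeMap<>(yesterday);
        today.put("JVM", "1.7.0_51");
        today.put("gateway.properties",
                sha256Hex("timeout=60".getBytes(StandardCharsets.UTF_8)));

        for (String delta : diff(yesterday, today)) {
            System.out.println(delta);
        }
    }
}
```

Storing one snapshot per day gives the version / change history the slide asks for, and an empty diff is a quick way to rule out "is it a different system?" during problem determination.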

PD Doc #2 - Application Logs
- Instrument your code!
  - Log statements
  - Log data categories
  - Performance counters (system loading)
  - Stack traces
  - Race conditions (timeout counters)
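
One way to read "instrument your code" is to wrap important operations so that they emit log statements, update counters, and record stack traces on failure. The sketch below does this with `java.util.logging`; it is an illustration under assumptions, not code from CONNECT. The class name `Instrumented`, the counter names, and the slow-call threshold are hypothetical.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.atomic.AtomicLong;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical sketch: wrapper, counter names and thresholds are illustrative assumptions.
final class Instrumented {
    private static final Logger LOG = Logger.getLogger(Instrumented.class.getName());

    // Simple performance / timeout counters that a monitoring tool could poll.
    static final AtomicLong calls = new AtomicLong();
    static final AtomicLong slowCalls = new AtomicLong();
    static final AtomicLong failures = new AtomicLong();

    /** Runs a task, logging entry/exit, slow executions, and failures with stack trace. */
    static <T> T timed(String name, long slowThresholdMillis, Callable<T> task) {
        calls.incrementAndGet();
        long start = System.nanoTime();
        LOG.log(Level.FINE, "enter {0}", name);
        try {
            return task.call();
        } catch (Exception e) {
            failures.incrementAndGet();
            // Passing the exception makes the handler include its stack trace.
            LOG.log(Level.SEVERE, "failed " + name, e);
            throw new IllegalStateException(name + " failed", e);
        } finally {
            long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
            if (elapsedMillis > slowThresholdMillis) {
                slowCalls.incrementAndGet();
                LOG.log(Level.WARNING, "{0} took {1} ms", new Object[] {name, elapsedMillis});
            }
            LOG.log(Level.FINE, "exit {0}", name);
        }
    }

    public static void main(String[] args) {
        Integer result = timed("sample-task", 1_000, () -> 1 + 1);
        System.out.println("result=" + result + ", calls=" + calls.get()
                + ", slow=" + slowCalls.get() + ", failures=" + failures.get());
    }
}
```

A rising `slowCalls` count with no matching failures is the kind of early signal that helps separate loading problems from logic faults.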

PD Doc #2A: App Log via JVM
[Diagram labels: OS, Drivers/DLL, JVM; info sources: Signals, Logs, Traces, Exceptions, Assert; APIs; View & Analysis Tools; CONNECT LOG CODE, DBMS, LOGFILE, JAVA Console, JAVA Admin]

Java JVM Log
- Logging: redirect Java Console output to a log file via the Java Logging API
- To enable logging, perform the following actions:
  - Open the Java Control Panel / Admin panel
  - Click the Advanced tab
  - Select Enable Logging under the Debugging option

Java Log options
- Options:
  - Redirect System.out & System.err
    - To log file
    - To network socket
    - To OutputStream
    - To memory buffer
  - Rotating log files
  - Formatters
    - XML or text
- Levels:
  - Severe, warning, info, config, fine, finer, finest

App Log control (JDK 1.4 and later)
[Diagram labels: OS, Drivers/DLL, JVM; info sources: Signals, Logs, Traces, Exceptions, Assert; APIs; View & Analysis Tools; CONNECT LOG CTL CODE, DBMS, LOGFILE, JAVA Console, JAVA Admin; Filters, Formatters (XML, Text), Modifiers; levels: Fine ... Finest; outputs: Net Socket, Mem Buff, OutputStream]

JAVA Logging Framework

Native JVM log components - functions
[Diagram labels: outputs: SOCKET, CONSOLE, BUFFER, FILE; formats: XML, Txt; filter to exclude messages with a particular key; configuration per class]
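
The native components listed in these slides (levels, handlers, formatters, key-based filters, per-class configuration) map onto the `java.util.logging` API roughly as sketched below. This is an illustrative setup, not the CONNECT configuration. The class name `JulOptionsSketch`, the logger name and the excluded key `HEARTBEAT` are assumptions, and the in-memory `ListHandler` is a simplified stand-in for the built-in file, socket, stream and memory handlers.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;
import java.util.logging.SimpleFormatter;

// Hypothetical sketch: logger names, the excluded key and ListHandler are assumptions.
final class JulOptionsSketch {
    // LogManager holds loggers weakly; keep strong references so the setup survives GC.
    private static final List<Logger> CONFIGURED = new ArrayList<>();

    /** Minimal in-memory handler that stores formatted records in a list. */
    static final class ListHandler extends Handler {
        final List<String> lines = new ArrayList<>();

        @Override public void publish(LogRecord record) {
            // isLoggable applies this handler's level and filter.
            if (isLoggable(record)) {
                lines.add(getFormatter().format(record));
            }
        }
        @Override public void flush() {}
        @Override public void close() {}
    }

    /** Configures one named logger (per class or module) with its own level, filter and format. */
    static ListHandler configure(String loggerName, Level level, String excludedKey) {
        Logger logger = Logger.getLogger(loggerName);
        logger.setUseParentHandlers(false);   // do not also write to the console
        logger.setLevel(level);               // per-logger level: SEVERE ... FINEST

        ListHandler handler = new ListHandler();
        handler.setLevel(Level.ALL);
        handler.setFormatter(new SimpleFormatter());   // XMLFormatter is the built-in XML option
        // Filter: exclude messages that contain a particular key.
        handler.setFilter(r -> r.getMessage() == null || !r.getMessage().contains(excludedKey));
        logger.addHandler(handler);

        CONFIGURED.add(logger);
        return handler;
    }

    public static void main(String[] args) {
        ListHandler handler = configure("demo.connect.gateway", Level.INFO, "HEARTBEAT");
        Logger logger = Logger.getLogger("demo.connect.gateway");
        logger.info("transaction accepted");
        logger.info("HEARTBEAT ok");            // dropped by the filter
        logger.fine("below the logger level");  // dropped by the level
        for (String line : handler.lines) {
            System.out.print(line);
        }
    }
}
```

In a deployed system the same settings are usually expressed in a logging properties file rather than code, so field staff can raise a module's level to FINE without a rebuild.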

More options: open source log4J
- Sun Java Log API
  - Universal
  - No external dependencies
  - Generally included in proprietary solutions
- log4J Log API
  - IBM ported RAS code => Java => open source
  - More output options
  - Flexible config
  - Longer history, smaller footprint, faster, thread safe

log4J: More output options
[Diagram labels: outputs: Unix Syslog, Email, NT event log, SOCKET, CONSOLE, BUFFER, FILE; layouts: XML, Txt, HTML, TTCC; formatter layout: thread id, class, etc.; filter to exclude messages with a particular key; configuration per class / per thread]

Other log4J Log improvements
- Improved performance
  - Asynchronous loggers
  - 10x throughput and orders of magnitude lower latency
- Support for multiple APIs
  - SLF4J: Simple Logging Facade
    - USER plugs in log framework at deployment time
  - Commons Logging
    - Change logging implementation without recompilation
- Automatic reloading of configurations
  - Without losing log events while reconfiguration is taking place

(PD) Mechanisms: JVM Trace
[Diagram labels: OS, JVM, App Svr, CONNECT, App, DBMS; info sources: Signals, Logs, Traces, Exceptions, Assert; APIs; View & Analysis Tools; Mem circular trace buffer, JAVA Console, File; JAVA Control Panel; Java.plugin.trace.option: basic, cache, net, security, ext, liveconnect, all]

Java Trace
- Set initial trace level for Java Web Start application
  - Set the deployment property deployment.trace.level
  - Basic, cache, net, security, ext, liveconnect, all
- Change trace level with API, trigger events
  - JVMRI (IBM - RAS Interface, deprecated)
  - JVMPI (Sun Profiling Interface, deprecated)
  - JVMTI (JVM / Oracle / IBM Tools Interface, current)

Problem Determination Solutions
- Open source PD
  - Example: log4J
  - Advantages:
    - Source available for debugging/extensions
    - Small scale projects
    - Can be customized to emulate proprietary functionality
- Proprietary PD
  - System examples: WebSphere, WebLogic
  - Advantages:
    - Subsystem integration & testing version control
    - PD tools => problem determinations cover more system components

WebLogic Log Diagram
[Diagram]

IBM WebSphere LOG extensions
- IBM extensions of log4J
  - Logging domains
  - Nested Diagnostic Contexts (NDC)
  - Mapped Diagnostic Contexts (MDC)

Advantages - Proprietary Solutions
- IBM WebSphere
  - JVM log + log4J + proprietary extensions
  - Integrate mainframe experience
  - Streamlined binary log/trace 3x faster
  - Multi-server log merge
  - Advanced filtering and admin consoles
- Merged open source with proprietary extensions

Expand scope of debug info to App
[Diagram labels: Health Care Provider, CMS, Quality Report (PQRS), PHI - XML, Feedback]

Expand scope of debug info to App w/ many vendors & transactions
[Diagram labels: Provider: IE_EHR (200+ vendors), PM, DBMS (legacy, cloud), CONNECT & other subsystems; HIEs; CONNECT; CMS: Vetting SW, Parser SW, File Management, DBMS; flows: Quality Report, FEEDBACK]

Users want system totally functional. Debug tools => systemwide solutions!
- Users want problem resolved ASAP
- Users care about MTTR (Mean Time To Recovery/Repair)
- System deltas need to be bridged
  - Transactions
  - Remote procs
  - Error handling
[Diagram labels: Provider, CMS, CONNECT & other subsystems; Quality Report: Participants, Roles, Date/Time, Locations, Vitals, Lab Reports; FEEDBACK: Vetting, Pre-Submission, Submission, Incentives, Disincentives]

Tools must be able to identify the many sources of system fault(s)
[Diagram labels: Provider: IE_EHR (200+ vendors), PM, DBMS (legacy, cloud), CONNECT & other subsystems; HIEs; Routers; CONNECT; CMS: Vetting SW, Parser SW, File Management, DBMS; captions: "Where's my feedback?", "Where's my report???"]

Each subsystem has diagnostic logs
- Multiple logs
  - System wide vs. app specific
  - Defined interfaces
  - Improve code maintenance
- Scope of diagnostics
  - System wide vs. app specific
  - Defined vs. custom interfaces
- Tradeoffs
  - Interaction
    - Other system components
    - Other apps
    - External systems
  - Impact on system performance
  - Communications ability
[Diagram labels: JVM, App Server, OS]

Need for composite logs
- Multiple log functions
  - Sync and parse
  - System wide & app specific
  - Defined interfaces
  - Improve SYSTEM maintenance
- Scope of diagnostics
  - System wide
  - All interfaces
[Diagram labels: JVM, App Server, OS]

System Support - delegation
- Handoff of system support
  - From programmers to field support
  - Planned transition
  - Enable programmers to be more efficient
- CONNECT improvement
  - RAS: Reliability, Availability, Serviceability
  - (Semi) automatic problem resolution
  - System modularity

CMS report pathways (2014/2015)
[Diagram labels: Health Care Provider: IE_EHR, PM, DBMS, Report Generator, Logs / Audit; DIRECT, CONNECT, CONNECT / DIRECT; protocols: SMTP, XDS.b, X.509, S/MIME, ODBC; CMS: Vetting, Parser, File Management, DBMS, Source Control; flows: Quality Report (PQRS), PHI - XML, FEEDBACK]

CMS report components
- Quality Report (PQRS): PHI, patient medical record
  - Section a. (PM): Org / Provider / Dates, ICD / CPT / DRG
  - Section b. (_EHR): Vitals & lab results (SYS BP = xxx)
- Feedback
[Diagram labels: Health Care Provider; CONNECT: Registry, Repository, Core Services, Gateway, MPI, Client Interface, Logs / Audit; CMS: Vetting, Parser, File Management, DBMS, Source Control]

Review: CONNECT Field Support
- Coordinated Problem Determination (PD)
  - Goal: Improve RAS
    - Increase Reliability, Availability, Serviceability
  - Milestones to goal
    - Problem management discipline
    - Problem determination workflow procedures
    - RAS tool solutions
      - Open source & proprietary
      - Vendor choice(s) affects procedures, staffing & MTTR
      - MTTR (Mean Time To Recovery/Repair)

Review: CONNECT/DIRECT PD processes
- Standardized field support RAS procedures
  - Enable field support and non-programmers to extend support
    - Collect USEFUL diagnostic info
    - Start initial diagnostic process
    - Interact with advanced diagnostics
- Diagnostic document workflow and debug procedures
  - Cartography - functional map of complete system
  - Understand diagnostic data flow - modules & protocols
- Problem Determination (PD) automation tools
  - Automated data collection
    - Diagnostic APIs: logs, traces, events, signals, exceptions
  - Forensic data mining => log parsing, sorting & analysis
    - Identify events leading up to problem, isolate source(s) of problems

Problem Determination (PD) Tools
[Diagram labels: System 1, System 2, System 3; layers: OS, Drivers/DLL, JVM, App Svr, CONNECT, App, Thread(s), DBMS; diagnostic info sources: Signals, Logs, Traces, Exceptions, Assert; APIs: Filters, Formatters, Modifiers; output options: Net Socket, Mem Buff, OutputStream, Console, File; View & Analysis Tools]

CMS Quality Report workflow with CONNECT/DIRECT
[Diagram labels: Provider: IE_EHR (200+ vendors), PM, DBMS (legacy, cloud), CONNECT & other subsystems; HIEs; CONNECT; CCD / PQRS; CMS: Vetting, XML Parser, File Management, DBMS; flows: Quality Report, FEEDBACK]

Contact Info
We are developing a Field Support Toolbox for CONNECT / DIRECT.
This toolbox will include a variety of Problem Resolution Tools.
Please email any requirements or questions to:
Nick [email protected]
Thank you for participating!