engage 2016: back from the dead: how bad code kills a good server
TRANSCRIPT
Back from the Dead: When Bad Code Kills a Good Server
Engage User Group Conference, Eindhoven
March 2016
Serdar Basegmez - Developi - @serdar_basegmez
William Malchisky Jr. - ESS - @BillMalchisky
#engageug © 2016 Serdar Basegmez and William Malchisky Jr. Licensed under Creative Commons BY-NC-SA 4.0

Our Story in Forty Minutes
• Preface
• Chapter 1 - The Beginning
• Chapter 2 - Searching for Clues
• Chapter 3 - Creating a Solid Platform
• Chapter 4 - The Soft Side of Performance Gains
• The Final Chapter - Results

Disclaimer
"Ladies and Gentlemen. The story you are about to see is true; the names have been changed to protect the innocent." --Dragnet
For example... Acme Corporation is now referred to as Acme, Inc.

Setting Expectations
• What we will cover
• Problem analysis
• Troubleshooting skills
• Best practices
• The performance impact of suboptimal applications
• What we omitted
• Boring, rambling, dry lectures
• Useless drivel

Customer Calls
• "We're having a problem. Can you help?"
• "Absolutely. What's happening?"
• "Our mission-critical DB is really $%&@#$^& our users. It's way too slow. It takes less time to reboot [Windows 3.1 on an i386 with 32 MB RAM] than to open a document."
• "Any idea what changed?"
• "We don't know. We have not touched the box."

Why Domino Servers Fail?
• Lack of expertise and/or knowledge
• Unplanned and/or unexpected expansion
• No dedicated Administrator
• No change management
• No monitoring
• Workaround overloading

"Round Up the Usual Suspects"
• While waiting for access... request the following:
  notes.ini, log.nsf, sh tasks, top, vmstat, iosys, df -h, mount, swapon -s, user-to-server ping results, server NAB DB copy (sans users)
• Helps establish the level of criticality

Quick Example - iostat, vmstat
malchw@san-domino:~$ iostat
Linux 3.13.0-83-generic (san-domino) 03/23/2016 _x86_64_ (8 CPU)
avg-cpu: %user %nice %system %iowait %steal %idle
6.21 0.25 3.69 0.51 0.00 89.34
Device: tps kB_read/s kB_wrtn/s kB_read kB_wrtn
sda 45.34 2075.44 778.25 6028264 2260469
sdb 0.36 1.52 0.03 4422 80
dm-0 24.51 117.04 186.80 339957 542584
dm-1 16.17 415.61 79.82 1207173 231836
dm-2 17.64 1540.92 511.61 4475713 1485996
malchw@san-domino:~$ vmstat
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
1 0 0 16943764 153144 7941660 0 0 262 98 144 681 6 4 89 1 0

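When many vmstat samples arrive from a customer, a small script can pick out the columns that matter for a memory-starved server. The following is a minimal sketch, not part of the original talk: the function names and the `wa_limit` threshold are illustrative assumptions, and the sample text is the vmstat output shown on the slide.

```python
def parse_vmstat(text):
    """Return the last sample of `vmstat` output as a dict of column name -> int."""
    lines = [ln for ln in text.strip().splitlines() if ln.strip()]
    # The column-name row is the one starting with "r b"; data rows follow it.
    header_idx = next(i for i, ln in enumerate(lines) if ln.split()[:2] == ["r", "b"])
    names = lines[header_idx].split()
    values = [int(v) for v in lines[-1].split()]
    return dict(zip(names, values))


def memory_pressure_flags(sample, wa_limit=20):
    """Flag swapping and I/O wait. `wa_limit` is an illustrative threshold, not a standard."""
    flags = []
    if sample["si"] > 0 or sample["so"] > 0:
        flags.append("swapping")
    if sample["wa"] > wa_limit:
        flags.append("high iowait")
    return flags


SAMPLE = """\
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r b swpd free buff cache si so bi bo in cs us sy id wa st
 1 0 0 16943764 153144 7941660 0 0 262 98 144 681 6 4 89 1 0
"""

sample = parse_vmstat(SAMPLE)
print(sample["free"], memory_pressure_flags(sample))
```

The first vmstat sample reports averages since boot, so in practice several interval samples (for example `vmstat 5 12`) give a better picture than a single line.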
Data, Data Everywhere
• Run DCT (Domino Configuration Tuner) - returned a few items, but nothing applicable to the performance issue experienced
• Check Domino stats
• Located a key issue - needle in haystack
• SAI (Server Availability Index) fluctuated wildly, frequently plummeting to 18% for minutes on end
• Locate any recent NSD files for analysis

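A sustained drop like the SAI falling to 18% "for minutes on end" is easy to miss in raw statistics. The sketch below assumes the SAI has been sampled once per minute into a plain list (the readings are invented for illustration) and reports every run of consecutive low values; the `threshold` and `min_run` defaults are arbitrary assumptions.

```python
def sustained_dips(samples, threshold=50, min_run=3):
    """Return (start_index, length) for each run of at least `min_run`
    consecutive samples below `threshold`."""
    runs, start = [], None
    for i, value in enumerate(samples):
        if value < threshold:
            if start is None:
                start = i
        elif start is not None:
            if i - start >= min_run:
                runs.append((start, i - start))
            start = None
    # A dip may still be in progress when the data ends.
    if start is not None and len(samples) - start >= min_run:
        runs.append((start, len(samples) - start))
    return runs


# Hypothetical per-minute SAI readings.
readings = [92, 88, 18, 19, 18, 20, 85, 40, 90, 18, 18, 18]
print(sustained_dips(readings))
```

Short single-sample dips (the lone 40 above) are ignored on purpose, so only the prolonged drops stand out.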
Pro Tip on Data Collection
• Watch the server when nobody else does
• Lots of strange things happen on servers overnight
• Observed the system processing over one million records in :15 twice a week, at different times
• For example… no one at Acme, Inc. knew this occurred or why

Initial Data Analysis - OS
• Swap space 50% of installed memory
• Memory was under 1 GB for mission-critical server
• Several key DBs contained 100k+ docs
• Combination created page-faulting plague, further eroding performance
• System properly patched
• Free space adequate

Initial Data Analysis - Notes.ini
• Obvious but important data points
• Server layout
• Where items are located
• Recognized server.id file
• Server tasks
• Contrast to sh tasks requested earlier
• No obvious problems

Initial Data Analysis - Amgr
• Agents running all hours of the night and day
• Agents running from DBs actively being compacted
• Agents running from DBs when updall and fixup running
• Not all scheduled agents needed to run all weekend

Initial Data Analysis - Log.nsf
• Compact still running when updall Program fires off
• Compact never finished before execution time ceiling hit
• Left largest DBs in a completely suboptimal state
• Connected to servers that did not exist
• Scheduled replication documents
• Significant delays with replica synchronization
• Ensured data never properly synchronized across domain
• Certain connection documents only covered two DBs

Initial Data Analysis - DBs
• Several big DBs last fixup completed two years ago
• Most heavily used files 30-75% Used
• Many views means clicking one forces a new index build
• No design, document, or attachment compression
• Design server task citing non-existent templates

Tier 1 - OS
• Swap space - No set rule these days
• 1.5x-2.0x RAM is good rule of thumb
• Memory - 4 GB per processor on busy servers
• VMware settings if available
• Avoid temptation of too many processors
• Review partitions and free space

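The swap rule of thumb above is simple arithmetic, and it shows how far off a server with swap at 50% of RAM is. A minimal sketch follows; the function names are hypothetical, and the 1.5x-2.0x multipliers are the ones from the slide.

```python
def swap_range_gb(ram_gb, low=1.5, high=2.0):
    """Return the (min, max) swap size in GB for the 1.5x-2.0x rule of thumb."""
    return ram_gb * low, ram_gb * high


def swap_is_adequate(ram_gb, swap_gb):
    """True when swap meets at least the lower bound of the rule of thumb."""
    minimum, _ = swap_range_gb(ram_gb)
    return swap_gb >= minimum


print(swap_range_gb(8))        # (12.0, 16.0)
print(swap_is_adequate(8, 4))  # swap at 50% of RAM -> False
```

As the slide notes, there is no set rule these days, so treat the result as a starting point for discussion rather than a hard requirement.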
Additional OS Considerations
• Check that previously made system changes stick
• Unfamiliar servers can exhibit odd behavior
• Check Technotes for any recent performance issues
• Once OS is working, check to ensure that virtualization is optimal

Tier 2 - Domino
• Properly space Program documents
• Avoid overlap with agents and other Programs
• Pause agent schedule during maintenance
• Schedule a weekend to complete first full maintenance
• First full compact will take much longer than you realize
• Create maintenance schedule of tasks agreed to by business line managers
• Ensures all needed jobs are available when needed

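Spacing Program documents and agents so they do not overlap can be checked on paper before touching the server. The sketch below assumes each job is written down as a name, a start time, and an expected duration in minutes; the schedule itself is invented for illustration, and windows that cross midnight are not handled.

```python
from itertools import combinations


def to_minutes(hhmm):
    """Convert "HH:MM" to minutes since midnight."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


def find_overlaps(jobs):
    """jobs: list of (name, start "HH:MM", expected duration in minutes).
    Returns pairs of job names whose time windows overlap within one day."""
    windows = [(name, to_minutes(start), to_minutes(start) + dur)
               for name, start, dur in jobs]
    overlaps = []
    for (a, a_start, a_end), (b, b_start, b_end) in combinations(windows, 2):
        if a_start < b_end and b_start < a_end:
            overlaps.append((a, b))
    return overlaps


# Hypothetical maintenance schedule.
schedule = [
    ("compact", "01:00", 180),
    ("updall", "03:00", 60),
    ("nightly agent", "03:30", 45),
    ("fixup", "05:00", 60),
]
for pair in find_overlaps(schedule):
    print("overlap:", pair)
```

The durations should come from observed run times in the log, since the first full compact tends to run far longer than planned.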
Additional Items to Fix
• Review all enabled Domino features to ensure that they function properly
• Simple configuration miscues can have a negative impact
• Cluster replication unable to locate a cluster member
• DNS errors create lookup delays
• Remove unneeded, deprecated network ports

Where Are We?
• Domino Admin handled the first-level treatment
• Server performs well, but not well enough
• Triangulated the issue to a mission-critical application
• Now what?

Why Domino Apps Fail?
• Lack of expertise and/or knowledge
• Developers evolved from power users
• Architecture overloading
• Unplanned and/or unexpected expansion
• Undocumented code and/or business process
• No change management
• Quick & dirty development

Developers vs Performance Issues
• There is no magic pill for finding a performance issue
• Many problems are circumstantial
• Depends on who/when/how…
• Repeating the problem in a controlled environment
• Need for proof!
• The most difficult part of the task
• Need to be systematic

Science Just Works!
• Research and Assessment,
• Speculation for fixes,
• Experiment,
• Prove!
http://www.wired.com/2013/04/whats-wrong-with-the-scientific-method/

Methodology
Research ✤ Symptoms (e.g. logs, performance data, etc.) ✤ Story (e.g. user input) ✤ Application code
Hypothesis ✤ Speculation on possible reasons ✤ Search for 'Usual Suspects'
Experiment ✤ Testing for possible reasons
Analyze ✤ Check whether the symptoms are fixed
Conclusion ✤ Issue validated and proved to be fixed

Research & Assessment
• What to collect, based on the symptom:
• CPU/memory load, hangs, spikes, crashes, etc.
• All the time, the same time every day, or random?
• Experienced by specific users?
• We are looking for a pattern between incidents.

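Looking for a pattern between incidents can start with something as small as counting incidents per hour of day. This is a minimal sketch under the assumption that incident start times have been noted as ISO 8601 strings; the timestamps below are made up for illustration.

```python
from collections import Counter
from datetime import datetime


def incidents_by_hour(timestamps):
    """Count incidents per hour of day from ISO 8601 timestamp strings."""
    return Counter(datetime.fromisoformat(ts).hour for ts in timestamps)


# Hypothetical incident start times reported by users.
incidents = [
    "2016-03-01T14:05:00",
    "2016-03-02T14:20:00",
    "2016-03-03T09:45:00",
    "2016-03-04T14:10:00",
]
hour, count = incidents_by_hour(incidents).most_common(1)[0]
print(f"{count} of {len(incidents)} incidents started in hour {hour}:00")
```

The same grouping works for weekday or user name; a cluster in one bucket is a hint about where to look next, not proof of a cause.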
Data Collection Checklist
Log/NSD/Semaphore files
Server configuration (inc. notes.ini)
Server monitoring and statistics data
Web logs (for web application issues)
XPages and OSGi logs (for XPages-specific issues)
Application and dependencies

Isolate the Application
• Sometimes, even opening in DDE may cause issues!
• e.g. XPages components are automatically built
• Application code might have side effects
• e.g. Updating another data source, adding audit logs, performance degradation on the server, etc.
• There will be dependencies
• Once isolated, we can start inspection…

Usual Suspects
• Database corruptions
• @Today/@Now in views
• Code snippets acting like an admin
• Updating views, replicating databases, running server commands, etc.
• Code snippets using the worst practices
• Searching a large database, wrong looping, etc.
• Anything that fits into the pattern, if there is one
• e.g. An agent matching the incident timing

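One of the suspects above, @Today/@Now in views, can be screened for mechanically once the view formulas are available as text. The sketch below assumes the formulas have already been exported into a dictionary by some other means (the export step, the view names, and the formulas are all hypothetical); it only does the text search.

```python
import re

# Matches @Today or @Now as whole function names, in any letter case.
TIME_FUNCTIONS = re.compile(r"@(Today|Now)\b", re.IGNORECASE)


def views_with_time_functions(formulas):
    """formulas: dict mapping view name -> selection or column formula text.
    Returns the sorted names of views whose formula uses @Today or @Now."""
    return sorted(name for name, text in formulas.items()
                  if TIME_FUNCTIONS.search(text))


# Hypothetical exported formulas.
exported = {
    "Open Orders": 'SELECT Form = "Order" & Status = "Open"',
    "Due This Week": 'SELECT Form = "Order" & DueDate < @Adjust(@Today; 0; 0; 7; 0; 0; 0)',
    "Recent": "SELECT @Created > @Adjust(@Now; 0; 0; -1; 0; 0; 0)",
}
print(views_with_time_functions(exported))
```

A hit is only a lead: each flagged view still needs a look to judge how large it is and how often it is opened.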
Team Up!
• Deeper investigation needs a team effort
• Admins and Developers should collaborate
• A test setup to simulate the production environment
• Intensive/controlled debugging sessions in limited time windows
• Sharing expertise
• Experimenting on production should be the last resort
• Once a repeatable error is found, cooperate on a solution

Example Case - Analysis
• JVM crash with the HTTP task
• Random times
• No pattern in the log
• Memory dumps point to a leak in the JVM heap
• Inspected XPages applications, nothing found
• Triangulated the problem to one XPages app, following clues in intensive debugging on memory
• Isolated the application for a load test, nothing found
• Increased logging to collect more data, no hope!

Example Case - Resolution
• Checked the server configuration and noticed:
• Logging data was incomplete
• Removed exclusions
• New logs pointed to the problem
• Search software crawling a specific page
• Page generates state data, fills up the memory
• Simulated the same crash on the test environment
• One line of code fixed the issue

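Once the logging exclusions were removed, the crawler showed up as one client hammering one page. A quick way to surface that kind of pattern is to count hits per user agent and path. The sketch below assumes text web logs in Combined Log Format, which may not match a given server's configuration; the log lines and the `ExampleCrawler` agent are invented.

```python
import re
from collections import Counter

# Combined Log Format: host ident user [time] "request" status bytes "referer" "user-agent"
LINE = re.compile(
    r'"(?:GET|POST|HEAD) (?P<path>\S+)[^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def hits_by_agent_and_path(lines):
    """Count requests per (user agent, path) pair; unparseable lines are skipped."""
    counts = Counter()
    for line in lines:
        match = LINE.search(line)
        if match:
            counts[(match.group("agent"), match.group("path"))] += 1
    return counts


# Hypothetical log lines.
log = [
    '10.0.0.5 - - [23/Mar/2016:10:00:01 +0100] "GET /app.nsf/page.xsp HTTP/1.1" 200 512 "-" "ExampleCrawler/1.0"',
    '10.0.0.5 - - [23/Mar/2016:10:00:02 +0100] "GET /app.nsf/page.xsp HTTP/1.1" 200 512 "-" "ExampleCrawler/1.0"',
    '10.0.0.9 - jdoe [23/Mar/2016:10:00:03 +0100] "GET /app.nsf/home.xsp HTTP/1.1" 200 2048 "-" "Mozilla/5.0"',
]
for (agent, path), hits in hits_by_agent_and_path(log).most_common(1):
    print(agent, path, hits)
```

This only works if the traffic is actually logged, which is the lesson of the case: check the logging exclusions before trusting a quiet log.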
Another Case - Analysis
• A mission-critical application at a bank
• Web application with 2000+ users
• CPU spikes and random hangs, mostly in the afternoon
• Logs are clear, no crashes, no error messages
• Isolated the application, inspected the 'usual suspects'
• Found a web agent updating a view!
• Triangulated the problem using web logs and SEMDEBUG
• But could not validate the issue on the test environment…

Another Case - Resolution
• Cooperated with the Domino Admin
• Detailed assessment of the server configuration
• We found the issue!
• "ServerTasksAt14" running an updall task.
• Another Program document running Updall on a specific database, every 30 minutes
• Applied to the test platform, validated by a load test
• Problem solved!

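A setting like "ServerTasksAt14" is easy to overlook in a long notes.ini. The following sketch lists which hours have a given task scheduled through `ServerTasksAt<hour>` lines. It assumes the notes.ini content has been read into a string; the sample content and function names are illustrative, and Program documents in the Domino Directory are not covered by this check.

```python
import re

ENTRY = re.compile(r"^ServerTasksAt(\d{1,2})=(.*)$", re.IGNORECASE)


def scheduled_tasks(ini_text):
    """Return {hour: [task, ...]} from ServerTasksAt<hour>= lines in notes.ini text."""
    schedule = {}
    for line in ini_text.splitlines():
        match = ENTRY.match(line.strip())
        if match:
            tasks = [t.strip() for t in match.group(2).split(",") if t.strip()]
            schedule[int(match.group(1))] = tasks
    return schedule


def hours_running(schedule, task_name):
    """Hours whose task list has an entry starting with `task_name` (case-insensitive)."""
    wanted = task_name.lower()
    return sorted(hour for hour, tasks in schedule.items()
                  if any(t.lower().startswith(wanted) for t in tasks))


# Hypothetical notes.ini excerpt.
SAMPLE_INI = """\
ServerTasks=Update,Replica,Router,AMgr,HTTP
ServerTasksAt2=Catalog,Design
ServerTasksAt14=Updall
"""
print(hours_running(scheduled_tasks(SAMPLE_INI), "updall"))
```

An updall scheduled for 14:00 lines up with the "mostly afternoon" symptom from the analysis, which is the kind of match between schedule and incident timing worth checking first.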
Quality Analysis Yields Quality Results
• Page faults reduced to zero
• General DB usage and administration tasks work well
• SAI now over 80%
• Weird overnight (agent) system operations resolved
• Key DBs have 93% used space threshold now
• All DBs compressed: design, documents, all attachments
• Program documents, agent schedules all adjusted: finish, no overlap

Note on Performance
When done properly, few users tend to notice the change, but if reverted they will all complain

Teamwork vs. Performance
Neither admin nor developer could solve all of these issues alone!

Bonus Slide
• You can get help inspecting applications and servers!
• They have also helped Engage!
cooperteam, MartinScott, teamstudio, Ytria

Serdar Başeğmez
• IBM Champion (2011-2016)
• Developi Information Systems, Istanbul
• Contributing…
• OpenNTF / LUGTR / LotusNotus.com
• Featured on…
• Engage UG, IBM Connect, ICON UK, NotesIn9…
• Also…
• Blogger and Podcaster on Scientific Skepticism

William Malchisky Jr.
• IBM Champion (2011-2016)
• Effective Software Solutions, LLC
• Co-founder of Linuxfest at Lotusphere/Connect
• Speaker at 20+ Lotus/IBM related events/LUGs
• Co-authored two IBM Redbooks
• Co-wrote the IBM Education Administration certification track for Domino 8.5

Follow Up - Contact Information
Serdar Basegmez
@serdar_basegmez Skype: sbasegmez
Bill Malchisky Jr.
@billmalchisky Skype: FairTaxBill