Inside Azure Diagnostics
DESCRIPTION
**Session from Pittsburgh Tech Fest - June 2014**
TRANSCRIPT
![Page 1: Inside Azure Diagnostics](https://reader033.vdocuments.us/reader033/viewer/2022052901/556cca01d8b42aba548b507d/html5/thumbnails/1.jpg)
Inside Azure Diagnostics
Pittsburgh Tech Fest
June 7th, 2014
![Page 3: Inside Azure Diagnostics](https://reader033.vdocuments.us/reader033/viewer/2022052901/556cca01d8b42aba548b507d/html5/thumbnails/3.jpg)
COLUMBUS, OH | OCTOBER 17, 2014 | CLOUDDEVELOP.ORG
Call for Speakers
-- June 27th --
![Page 4: Inside Azure Diagnostics](https://reader033.vdocuments.us/reader033/viewer/2022052901/556cca01d8b42aba548b507d/html5/thumbnails/4.jpg)
Today’s Agenda
1 / The need for diagnostic data in cloud applications
2 / Data we can monitor
3 / Using the Microsoft Azure Diagnostic Agent
4 / Real-world guidance for troubleshooting Microsoft Azure apps
![Page 5: Inside Azure Diagnostics](https://reader033.vdocuments.us/reader033/viewer/2022052901/556cca01d8b42aba548b507d/html5/thumbnails/5.jpg)
Success vs. Failure
Successful projects share at least one common trait . . .
node.js / C# / Java
Agile - vs - Waterfall
![Page 6: Inside Azure Diagnostics](https://reader033.vdocuments.us/reader033/viewer/2022052901/556cca01d8b42aba548b507d/html5/thumbnails/6.jpg)
Success vs. Failure
Successful projects share at least one common trait . . .
Diagnostics Data / Telemetry
![Page 7: Inside Azure Diagnostics](https://reader033.vdocuments.us/reader033/viewer/2022052901/556cca01d8b42aba548b507d/html5/thumbnails/7.jpg)
A True Story
Scenario
1 week before date of production launch. “Am I ready?”
“Let’s run some tests and look at your logs.”
“Logs? Yeah . . . we really don’t have logs.”
“OH . . .”
“Well, we eventually log any fatal errors, but that’s all.”
“I guess that’s better than nothing.”
“We looked at Azure diagnostic logging but didn’t see much value in it.”
![Page 8: Inside Azure Diagnostics](https://reader033.vdocuments.us/reader033/viewer/2022052901/556cca01d8b42aba548b507d/html5/thumbnails/8.jpg)
A True Story
You’re kidding? Right?
![Page 9: Inside Azure Diagnostics](https://reader033.vdocuments.us/reader033/viewer/2022052901/556cca01d8b42aba548b507d/html5/thumbnails/9.jpg)
A True Story
Scenario
o Determine if solution is production ready
o Deployed as an Azure Cloud Service
o No load tests
o No performance tests
o No unit tests
o Very little instrumentation
We have a problem
http://www.cutedaily.com/wp-content/uploads/2011/11/shockedbaby.jpg
![Page 10: Inside Azure Diagnostics](https://reader033.vdocuments.us/reader033/viewer/2022052901/556cca01d8b42aba548b507d/html5/thumbnails/10.jpg)
A True Story
Resolution
o Step 0 – Enable Azure diagnostics
  • Set key performance counters
o Step 1 – Add logging statements around key functionality
  • Especially external services
o Step 3 – Test, test, test
o Step 4 – Analyze
o Step 5 – Fix it
![Page 11: Inside Azure Diagnostics](https://reader033.vdocuments.us/reader033/viewer/2022052901/556cca01d8b42aba548b507d/html5/thumbnails/11.jpg)
Instrumentation more important in “the cloud”
o Need to have good instrumentation for on-premises applications
o Cloud – it matters more!
  o Distributed environments and services
  o Composite applications
  o Reliance on 3rd party vendors . . . such as Microsoft for Azure
  o Highly automated environments
  o Scale out model
  o Massive amounts of data
![Page 12: Inside Azure Diagnostics](https://reader033.vdocuments.us/reader033/viewer/2022052901/556cca01d8b42aba548b507d/html5/thumbnails/12.jpg)
The Cloud Scales
worker roles
web roles
![Page 13: Inside Azure Diagnostics](https://reader033.vdocuments.us/reader033/viewer/2022052901/556cca01d8b42aba548b507d/html5/thumbnails/13.jpg)
The Cloud Scales . . . You Do Not
worker roles
web roles
Diagnostic Data – 4x
![Page 14: Inside Azure Diagnostics](https://reader033.vdocuments.us/reader033/viewer/2022052901/556cca01d8b42aba548b507d/html5/thumbnails/14.jpg)
Diagnostic Data
What data do you gather today?
Performance Counters
Custom Logs (NLog, log4net, etc.)
IIS Logs
Windows Event Logs
Crash Dumps
![Page 16: Inside Azure Diagnostics](https://reader033.vdocuments.us/reader033/viewer/2022052901/556cca01d8b42aba548b507d/html5/thumbnails/16.jpg)
Diagnostic Data – Azure Not so Different
Performance Counters
Custom Logs (NLog, log4net, etc.)
IIS Logs
Windows Event Logs
Crash Dumps
Azure Storage
![Page 17: Inside Azure Diagnostics](https://reader033.vdocuments.us/reader033/viewer/2022052901/556cca01d8b42aba548b507d/html5/thumbnails/17.jpg)
Diagnostic Data Storage

| Diagnostic Item | Table Name | Blob Container Name |
|---|---|---|
| Windows Event Logs | WADWindowsEventLogsTable | |
| Performance Counters | WADPerformanceCountersTable | |
| Trace Log Statements | WADLogsTable | |
| Azure Diagnostic Infrastructure Logs | WADDiagnosticInfrastructureLogsTable | |
| Custom Logs (e.g. log4net, NLog, etc.) | | <custom> |
| IIS Logs | WADDirectoriesTable* | wad-iis-logfiles |
| IIS Failed Request Logs | WADDirectoriesTable* | wad-iis-failedreqlogfiles |
| Crash Dumps | WADDirectoriesTable* | |

* Location of the blob log file is specified in the Container field and name of the blob in the RelativePath field. The AbsolutePath field contains the name of the file as it existed on the role instance.
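The footnote on WADDirectoriesTable can be sketched in code. This is a minimal illustration, not part of the original session: `DirectoryLogEntry` and `DirectoryLogLocator` are hypothetical names that only mirror the Container, RelativePath, and AbsolutePath fields, and the storage account name is a placeholder. The only outside fact relied on is the standard blob URL layout (account.blob.core.windows.net/container/blob).

```csharp
using System;

// Hypothetical shape of one WADDirectoriesTable row; only the three
// fields named in the footnote are modeled here.
public sealed class DirectoryLogEntry
{
    public string Container { get; set; }     // blob container, e.g. wad-iis-logfiles
    public string RelativePath { get; set; }  // name of the blob inside that container
    public string AbsolutePath { get; set; }  // file name as it existed on the role instance
}

public static class DirectoryLogLocator
{
    // Builds the blob URI for a row, given a storage account name.
    public static Uri GetBlobUri(string storageAccountName, DirectoryLogEntry entry)
    {
        if (string.IsNullOrEmpty(storageAccountName))
            throw new ArgumentException("Account name is required.", "storageAccountName");
        if (entry == null)
            throw new ArgumentNullException("entry");
        if (string.IsNullOrEmpty(entry.Container) || string.IsNullOrEmpty(entry.RelativePath))
            throw new ArgumentException("Container and RelativePath are required.", "entry");

        // Trim a leading '/' so the URI does not get an empty path segment.
        string blobName = entry.RelativePath.TrimStart('/');
        string url = string.Format(
            "https://{0}.blob.core.windows.net/{1}/{2}",
            storageAccountName, entry.Container, blobName);
        return new Uri(url);
    }
}
```

The AbsolutePath field is carried along only so a report can show where the file lived on the instance; it plays no part in locating the blob.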
![Page 18: Inside Azure Diagnostics](https://reader033.vdocuments.us/reader033/viewer/2022052901/556cca01d8b42aba548b507d/html5/thumbnails/18.jpg)
Diagnostic Monitor Agent
1. Role starts
2. Diagnostic monitor agent starts
3. Diagnostics configured
4. Data buffered locally
5. Data transferred to storage
wad-control-container
o Container in Azure blob storage
![Page 19: Inside Azure Diagnostics](https://reader033.vdocuments.us/reader033/viewer/2022052901/556cca01d8b42aba548b507d/html5/thumbnails/19.jpg)
Diagnostic Monitor Agent
![Page 20: Inside Azure Diagnostics](https://reader033.vdocuments.us/reader033/viewer/2022052901/556cca01d8b42aba548b507d/html5/thumbnails/20.jpg)
Configuration Options
Default Configuration
o Trace logs
o IIS logs
o Infrastructure logs
o No transfer
Imperative Configuration
o OnStart()
o Overrides default
Declarative Configuration
o diagnostics.wadcfg
o Overrides imperative
![Page 21: Inside Azure Diagnostics](https://reader033.vdocuments.us/reader033/viewer/2022052901/556cca01d8b42aba548b507d/html5/thumbnails/21.jpg)
Imperative

    // Requires: using System; using Microsoft.WindowsAzure.Diagnostics;
    //           using Microsoft.WindowsAzure.ServiceRuntime;
    public override bool OnStart()
    {
        // Create the DiagnosticMonitorConfiguration object to use for configuring the monitoring agent.
        DiagnosticMonitorConfiguration config = DiagnosticMonitor.GetDefaultInitialConfiguration();

        // Performance Counter configuration
        config.PerformanceCounters.DataSources.Add(new PerformanceCounterConfiguration
        {
            CounterSpecifier = @"\Processor(_Total)\% Processor Time",
            SampleRate = TimeSpan.FromSeconds(30)
        });
        config.PerformanceCounters.ScheduledTransferPeriod = TimeSpan.FromMinutes(1);

        // Log configuration
        config.Logs.ScheduledTransferLogLevelFilter = LogLevel.Information;
        config.Logs.ScheduledTransferPeriod = TimeSpan.FromMinutes(1);

        // Event Log configuration
        config.WindowsEventLog.DataSources.Add("Application!*");
        config.WindowsEventLog.DataSources.Add("System!*");
        config.WindowsEventLog.ScheduledTransferLogLevelFilter = LogLevel.Warning;
        config.WindowsEventLog.ScheduledTransferPeriod = TimeSpan.FromMinutes(1);

        // Start the diagnostic monitor with the new configuration
        DiagnosticMonitor.Start("Microsoft.WindowsAzure.Plugins.Diagnostics.ConnectionString", config);

        return base.OnStart();
    }
Impacts local agent only!
![Page 22: Inside Azure Diagnostics](https://reader033.vdocuments.us/reader033/viewer/2022052901/556cca01d8b42aba548b507d/html5/thumbnails/22.jpg)
Imperative
Deployment ID
![Page 23: Inside Azure Diagnostics](https://reader033.vdocuments.us/reader033/viewer/2022052901/556cca01d8b42aba548b507d/html5/thumbnails/23.jpg)
Declarative Configuration using Visual Studio
demo
![Page 24: Inside Azure Diagnostics](https://reader033.vdocuments.us/reader033/viewer/2022052901/556cca01d8b42aba548b507d/html5/thumbnails/24.jpg)
There’s a Precedence
1. wad-control-container
   a. Created for each role instance
2. Imperative code
   a. RoleInstanceDiagnosticManager.SetCurrentConfiguration() – updates the instance’s diagnostics.wadcfg only
   b. DiagnosticMonitor.Start() – impacts current instance only; will not update diagnostics.wadcfg
3. Declarative configuration
   a. Root of worker role or bin of web role
4. Default configuration
   a. Last resort
   b. Collects, but doesn’t transfer to Azure storage
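The numbered order on this slide can be read as a simple first-match rule. The sketch below only models that order; it is not how the diagnostic agent is implemented, and `DiagnosticsConfigSource` and `DiagnosticsPrecedence` are hypothetical names introduced for illustration.

```csharp
using System;

// The four configuration sources from the slide, highest precedence first.
public enum DiagnosticsConfigSource
{
    ControlContainer,  // 1. configuration blob in wad-control-container
    ImperativeCode,    // 2. configuration set from code, e.g. in OnStart()
    DeclarativeFile,   // 3. diagnostics.wadcfg shipped with the role
    Default            // 4. last resort: collects, but does not transfer
}

public static class DiagnosticsPrecedence
{
    // Returns the source that wins, given which sources are present.
    public static DiagnosticsConfigSource Resolve(
        bool hasControlContainerBlob, bool hasImperativeConfig, bool hasWadcfgFile)
    {
        if (hasControlContainerBlob) return DiagnosticsConfigSource.ControlContainer;
        if (hasImperativeConfig) return DiagnosticsConfigSource.ImperativeCode;
        if (hasWadcfgFile) return DiagnosticsConfigSource.DeclarativeFile;
        return DiagnosticsConfigSource.Default;
    }
}
```

A helper like this is mainly useful in troubleshooting notes or tests, to make explicit which configuration a given instance should be expected to use.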
![Page 25: Inside Azure Diagnostics](https://reader033.vdocuments.us/reader033/viewer/2022052901/556cca01d8b42aba548b507d/html5/thumbnails/25.jpg)
Update Diagnostic Configuration
o Deployment Update
  o Change configuration and redeploy package
o Remotely
  o Visual Studio
  o API
  o Cerebrata Azure Management Studio
![Page 26: Inside Azure Diagnostics](https://reader033.vdocuments.us/reader033/viewer/2022052901/556cca01d8b42aba548b507d/html5/thumbnails/26.jpg)
On-Demand Transfer
Instruct WAD to transfer specific data sources to storage
Specify which data sources
Specify time range to transfer
Specify a notification queue
Code or API (or tool)
Overwrites current diagnostic configuration
Use sparingly . . . with caution
For more info, see http://msdn.microsoft.com/en-us/library/gg433075.aspx
![Page 27: Inside Azure Diagnostics](https://reader033.vdocuments.us/reader033/viewer/2022052901/556cca01d8b42aba548b507d/html5/thumbnails/27.jpg)
Bonus: Verbose Logging
Additional host-level data – not DiagnosticAgent.exe
WAD*deploymentID*PT*aggregation_interval*[R|RI]Table
Aggregation at 5 minutes, 1 hour, and 12 hour intervals
10 day retention period
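The table naming pattern above can be expanded for each aggregation interval. This is a sketch under stated assumptions: the interval codes 5M, 1H, and 12H are inferred from the "PT" prefix and the three intervals listed, and reading R as role-level and RI as role-instance-level is an assumption, not something the slide states. `VerboseMonitoringTables` and both enums are hypothetical names.

```csharp
using System;

public enum AggregationInterval { FiveMinutes, OneHour, TwelveHours }

// Assumption: R = aggregated per role, RI = aggregated per role instance.
public enum AggregationScope { Role, RoleInstance }

public static class VerboseMonitoringTables
{
    // Expands WAD<deploymentID>PT<aggregation_interval>[R|RI]Table.
    public static string GetTableName(
        string deploymentId, AggregationInterval interval, AggregationScope scope)
    {
        if (string.IsNullOrEmpty(deploymentId))
            throw new ArgumentException("Deployment ID is required.", "deploymentId");

        string intervalCode;
        switch (interval)
        {
            case AggregationInterval.FiveMinutes: intervalCode = "5M"; break;
            case AggregationInterval.OneHour: intervalCode = "1H"; break;
            case AggregationInterval.TwelveHours: intervalCode = "12H"; break;
            default: throw new ArgumentOutOfRangeException("interval");
        }

        string scopeCode = scope == AggregationScope.Role ? "R" : "RI";
        return "WAD" + deploymentId + "PT" + intervalCode + scopeCode + "Table";
    }
}
```

Check the generated names against the tables that actually appear in your storage account before relying on them.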
![Page 28: Inside Azure Diagnostics](https://reader033.vdocuments.us/reader033/viewer/2022052901/556cca01d8b42aba548b507d/html5/thumbnails/28.jpg)
Let’s Get Real
o Sample every 1 minute*
o Transfer every 5 minutes*
o Transfer only what is needed
o Azure Diagnostics writes data in 60 second wide partitions
o Too much data could overwhelm the partition
* Don’t take my word for it. You don’t know me. Test and validate for your situation.
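The 60-second-wide partitions matter when querying, because a time range can be turned into a PartitionKey range instead of a full table scan. The sketch below assumes the commonly described key format of "0" followed by the tick count of the UTC minute; that format and the `WadPartitionKey` helper are assumptions to verify against your own WAD tables.

```csharp
using System;
using System.Globalization;

public static class WadPartitionKey
{
    // Returns "0" + ticks of the timestamp truncated to the whole minute (UTC).
    // Note: .NET treats DateTimeKind.Unspecified as local time in ToUniversalTime().
    public static string ForMinute(DateTime timestamp)
    {
        DateTime utc = timestamp.ToUniversalTime();
        long minuteTicks = utc.Ticks - (utc.Ticks % TimeSpan.TicksPerMinute);
        return "0" + minuteTicks.ToString(CultureInfo.InvariantCulture);
    }

    // Builds a table query filter covering [from, to) by partition.
    public static string RangeFilter(DateTime from, DateTime to)
    {
        return string.Format(
            CultureInfo.InvariantCulture,
            "PartitionKey ge '{0}' and PartitionKey lt '{1}'",
            ForMinute(from), ForMinute(to));
    }
}
```

Every sample written in the same minute shares one key, which is why sampling or transferring too much data in a short window concentrates load on a single partition.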
![Page 29: Inside Azure Diagnostics](https://reader033.vdocuments.us/reader033/viewer/2022052901/556cca01d8b42aba548b507d/html5/thumbnails/29.jpg)
Query Azure Diagnostic Data
demo
![Page 30: Inside Azure Diagnostics](https://reader033.vdocuments.us/reader033/viewer/2022052901/556cca01d8b42aba548b507d/html5/thumbnails/30.jpg)
Set Priorities
o Two separate channels for telemetry data
  o Vital information
    o Application or service failures. Higher level of alerting.
    o Fix and return to “normal” as soon as possible
    o Alert now – email, SMS, dashboard, ninjas from ceiling, etc.
  o Day-to-day operational data
    o Root cause analysis
    o How to prevent in the future
    o Azure diagnostics
o Fine tune the alerts – reduce false alarms and noise
![Page 31: Inside Azure Diagnostics](https://reader033.vdocuments.us/reader033/viewer/2022052901/556cca01d8b42aba548b507d/html5/thumbnails/31.jpg)
Define Key Metrics
Compute node resource usage
Windows Event logs
Database queries response times
Application specific exceptions
Database connection & cmd failures
Microsoft Azure Storage Analytics
Process for Azure hosted solutions is not that different from traditional, on-premises solutions.
![Page 32: Inside Azure Diagnostics](https://reader033.vdocuments.us/reader033/viewer/2022052901/556cca01d8b42aba548b507d/html5/thumbnails/32.jpg)
Considerations
o Log all calls to external services. Challenge an SLA?
o Log details of transient faults
o Partition telemetry data by date (or hour) – reduce impact of data aggregation or reporting
o Use a different storage account!
o Remove old / non-relevant telemetry data
o Apply to development, test, and QA versions – validate performance & ensure telemetry systems operating correctly
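Logging every call to an external service can be done with a small wrapper. This is a minimal sketch using only System.Diagnostics; `ExternalCallLogger` is a hypothetical helper, and it assumes the role's trace output is being collected by Azure Diagnostics so the lines end up with the other trace log statements.

```csharp
using System;
using System.Diagnostics;

public static class ExternalCallLogger
{
    // Runs the call, times it, and writes one trace line for success or failure.
    public static T Call<T>(string serviceName, Func<T> call)
    {
        if (call == null) throw new ArgumentNullException("call");

        Stopwatch timer = Stopwatch.StartNew();
        try
        {
            T result = call();
            Trace.TraceInformation(
                "External call {0} succeeded in {1} ms",
                serviceName, timer.ElapsedMilliseconds);
            return result;
        }
        catch (Exception ex)
        {
            // Record the failure and its duration, then let the caller handle it.
            Trace.TraceError(
                "External call {0} failed after {1} ms: {2}",
                serviceName, timer.ElapsedMilliseconds, ex.Message);
            throw;
        }
    }
}
```

Wrapping a call looks like `ExternalCallLogger.Call("PaymentService", () => client.Charge(order))`, where `client` and `order` are placeholders. The recorded durations and failures are the evidence needed when challenging an SLA.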
![Page 33: Inside Azure Diagnostics](https://reader033.vdocuments.us/reader033/viewer/2022052901/556cca01d8b42aba548b507d/html5/thumbnails/33.jpg)
Considerations (cont.)
o Bring Azure diagnostic data into relational database
  o Easier reporting
  o Periodically fetch from Azure table and insert into SQL Database table. Use PK and keep most recent.
  o Custom code
o Supplement Azure diagnostics with other tools
  o New Relic or AppDynamics
  o Cerebrata Azure Management Studio
  o AzureWatch (Paraleap)
![Page 34: Inside Azure Diagnostics](https://reader033.vdocuments.us/reader033/viewer/2022052901/556cca01d8b42aba548b507d/html5/thumbnails/34.jpg)
Summary
o Instrumentation and telemetry are key to successful projects
o Cloud metrics are similar to metrics for traditional applications
o Be realistic and set priorities
o 3rd party tools can be essential for troubleshooting
![Page 35: Inside Azure Diagnostics](https://reader033.vdocuments.us/reader033/viewer/2022052901/556cca01d8b42aba548b507d/html5/thumbnails/35.jpg)
Resources
o Diagnostics Configuration Order of Precedence – http://bit.ly/1eomek9
o Use the Azure Diagnostic Configuration File – http://bit.ly/1mVHN3u
o Cloud Service Fundamentals (wiki) – http://bit.ly/1k1YkjI
o Failsafe: Guidance for Resilient Cloud Architectures – http://bit.ly/Q33mkU
o Best Practices for the Design of Large-Scale Services on Windows Azure Cloud Services – http://bit.ly/1qp4omC
![Page 36: Inside Azure Diagnostics](https://reader033.vdocuments.us/reader033/viewer/2022052901/556cca01d8b42aba548b507d/html5/thumbnails/36.jpg)
Just Azure
o Multi-part series on Azure diagnostics
o Many other fantastic articles:
  o Azure storage queues
  o Cloud Services
  o Automated testing in Azure
www.JustAzure.com
![Page 37: Inside Azure Diagnostics](https://reader033.vdocuments.us/reader033/viewer/2022052901/556cca01d8b42aba548b507d/html5/thumbnails/37.jpg)
Questions?
![Page 38: Inside Azure Diagnostics](https://reader033.vdocuments.us/reader033/viewer/2022052901/556cca01d8b42aba548b507d/html5/thumbnails/38.jpg)
Thank You!
Michael S. Collier
Principal Cloud Architect
[email protected]
@MichaelCollier
www.MichaelSCollier.com