The Ultimate Logging Architecture: You Know You WANT It!

Posted on 13-Jul-2015

Michele Leroux Bustamante

michelebusta@solliance.net

@michelebusta

http://solliance.net

http://michelebusta.com

The Hello World of Logging

1992

Hello World!

Logging Today

2014

Web Browsers

Mobile Apps

Client Apps

Why do we log?

• Troubleshooting visibility

• Security audits, review, early detection

• Post incident forensics

• Track change history

• Insights into user activity

• Reporting and analysis

What to log?

• Event Logs

– Example: Application Events, Windows Logs, IIS Logs, Trace Output

• Audit Logs

– Example: Login Attempts, Unauthorized/Authorized Access, Password Resets

• Activity Logs

– Example: Session Trace, Purchase Flow, Report Generation, Feature Access

• History Logs

– Example: Change history for any critical system records

Live Streaming / Analytics

Make Logging EASY

Implement a Log Helper

ILogger

Logger

TraceDebug()

TraceInformation()

TraceWarning()

TraceError()

Throw()

Logger.Current.TraceInformation();

Logger.Current.Throw(ex);
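The helper on this slide can be sketched roughly as follows. This is a minimal Python illustration, not the speaker's implementation: the slide's names suggest .NET, so the method names are adapted to Python style, `Logger.current()` stands in for `Logger.Current`, and the behavior of `throw` (log, then re-raise) is an assumption.

```python
import logging
import threading
from abc import ABC, abstractmethod


class ILogger(ABC):
    """The slide's ILogger, with method names adapted to Python style."""

    @abstractmethod
    def trace_debug(self, message: str) -> None: ...

    @abstractmethod
    def trace_information(self, message: str) -> None: ...

    @abstractmethod
    def trace_warning(self, message: str) -> None: ...

    @abstractmethod
    def trace_error(self, message: str) -> None: ...

    @abstractmethod
    def throw(self, ex: Exception) -> None: ...


class Logger(ILogger):
    _current = None
    _lock = threading.Lock()

    def __init__(self, name: str = "app") -> None:
        # Hypothetical backend: the standard library logger.
        self._log = logging.getLogger(name)

    @classmethod
    def current(cls) -> "Logger":
        # Lazily create one shared instance (stands in for Logger.Current).
        with cls._lock:
            if cls._current is None:
                cls._current = cls()
            return cls._current

    def trace_debug(self, message: str) -> None:
        self._log.debug(message)

    def trace_information(self, message: str) -> None:
        self._log.info(message)

    def trace_warning(self, message: str) -> None:
        self._log.warning(message)

    def trace_error(self, message: str) -> None:
        self._log.error(message)

    def throw(self, ex: Exception) -> None:
        # Assumption: Throw() records the exception, then re-raises it.
        self._log.error("Exception: %r", ex)
        raise ex
```

Call sites then stay as short as the slide suggests, for example `Logger.current().trace_information("Order saved")`.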

Failure is NOT an option.

Event Logging

Just Do It

• Whatever is built in

• Whatever you know best

• Just do it

Encapsulate the Mechanism

ILogger

Logger

ELMAH / SLAB

Azure Diagnostics

log4j / log4net

Elasticsearch
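One way to read "encapsulate the mechanism" is that callers only see the logger, while the backend behind it can be swapped. The Python sketch below assumes a hypothetical `Sink` callable as the seam; it does not use the APIs of ELMAH, SLAB, log4net or any other product named on the slide.

```python
from typing import Callable, List, Tuple

# Hypothetical seam: a sink receives (level, message) and stores it somewhere.
Sink = Callable[[str, str], None]


class Logger:
    """Callers depend on this class only, never on the backend."""

    def __init__(self, sink: Sink) -> None:
        self._sink = sink

    def trace_information(self, message: str) -> None:
        self._sink("INFO", message)

    def trace_error(self, message: str) -> None:
        self._sink("ERROR", message)


def console_sink(level: str, message: str) -> None:
    # Simplest possible backend: standard output.
    print(f"{level}: {message}")


class MemorySink:
    """Backend that keeps records in a list, handy for tests."""

    def __init__(self) -> None:
        self.records: List[Tuple[str, str]] = []

    def __call__(self, level: str, message: str) -> None:
        self.records.append((level, message))
```

Switching from `Logger(console_sink)` to `Logger(MemorySink())` changes where entries go without touching any call site.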

Audit Logging

Logs and Compliance

• Contain no user credentials

• No PII, PHI or identifiable user data

• Retention period (1 year is good baseline)

• A structured archival process

• Alert if log reaches capacity

• Authorized access

• Protections from modifications (write-only)
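The first two rules above (no credentials, no PII or PHI) can be enforced at the point where an entry is written. The following Python sketch assumes a hypothetical deny-list of field names; a real list would come from your own compliance requirements.

```python
from typing import Any, Dict

# Hypothetical deny-list; adjust to the fields your compliance rules name.
SENSITIVE_KEYS = {"password", "token", "ssn", "email"}


def scrub(record: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of record with sensitive values masked."""
    clean: Dict[str, Any] = {}
    for key, value in record.items():
        if key.lower() in SENSITIVE_KEYS:
            clean[key] = "[REDACTED]"
        elif isinstance(value, dict):
            # Nested dictionaries are scrubbed too; lists are left as-is here.
            clean[key] = scrub(value)
        else:
            clean[key] = value
    return clean
```

Calling `scrub` inside the log helper, rather than at each call site, keeps the rule in one place.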

Implement an Audit Helper

ILogger

Logger

TraceXxx()

Throw()

IAuditLogger

AuditLogger

Write()

Throw()

AuditLogger.Current.Write();

AuditLogger.Current.Throw(ex);

Event Logs

Audit Logs

Logger.Current.TraceInformation();

Logger.Current.Throw(ex);

Azure Blobs

DocumentDB

Benefits of NoSQL

• Log details tend to evolve

– Schema-less storage is best

– Re-indexing may be necessary

• Co-locating logs with mainline databases has drawbacks

– Adds complexity and overhead (potentially)

– Does not allow a separate “evolution” team around telemetry and analysis

Audit Log Use Cases

• Every login attempt (success or failure)

• Excessive login attempts and lockouts

• Blocking/blacklisting users, IP addresses, access ports

• Every logout

• Every modification to user table, including permissions

• All configuration changes

• Attempts to access restricted resources, APIs from unexpected paths

• All access to PII / PHI in an individually identifiable way

Audit Log Fields

• Date/time of event

• Machine name/instance

• Process ID

• User ID (possibly encrypted) / Session ID

• Type of event

• Success or failure of the event (if applicable)

• Seriousness of the event violation (if applicable)

• Message (free form)

• Stack Trace (if applicable)
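The field list above maps naturally onto a small record type. The Python sketch below is one possible shape: the field names, the JSON-lines file format and the `AuditLogger.write` signature are all assumptions for illustration, and the file opened in append mode only approximates the "write-only" protection mentioned earlier.

```python
import json
import os
import socket
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class AuditRecord:
    event_type: str                      # Type of event
    user_id: str                         # User ID (possibly encrypted)
    session_id: str = ""                 # Session ID
    success: Optional[bool] = None       # Success or failure, if applicable
    severity: Optional[str] = None       # Seriousness, if applicable
    message: str = ""                    # Free-form message
    stack_trace: Optional[str] = None    # Stack trace, if applicable
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )
    machine: str = field(default_factory=socket.gethostname)
    process_id: int = field(default_factory=os.getpid)


class AuditLogger:
    def __init__(self, path: str) -> None:
        self._path = path

    def write(self, record: AuditRecord) -> None:
        # Append one JSON document per line; existing entries are never rewritten.
        with open(self._path, "a", encoding="utf-8") as handle:
            handle.write(json.dumps(asdict(record)) + "\n")
```

Timestamps are captured in UTC when the record is created, so entries from several machines can be ordered later.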

History and Activity Logging

History Logs

• Changes made to key tables

• Describes

– Who changed the record?

– From which application?

– Which fields changed?

• Need the ability to surface this to applications

– Sometimes to users

– Always to operations to solve problems
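A history entry that answers the three questions above (who, from which application, which fields) can be built by comparing the record before and after the change. This Python sketch assumes records are plain dictionaries; `build_history_entry` and its field names are hypothetical.

```python
from datetime import datetime, timezone
from typing import Any, Dict


def build_history_entry(
    table: str,
    key: Any,
    before: Dict[str, Any],
    after: Dict[str, Any],
    changed_by: str,
    application: str,
) -> Dict[str, Any]:
    """Describe who changed a record, from where, and which fields changed."""
    changes = {
        name: {"old": before.get(name), "new": after.get(name)}
        for name in sorted(set(before) | set(after))
        if before.get(name) != after.get(name)
    }
    return {
        "table": table,
        "key": key,
        "changed_by": changed_by,       # Who changed the record?
        "application": application,     # From which application?
        "changes": changes,             # Which fields changed?
        "changed_at": datetime.now(timezone.utc).isoformat(),
    }
```

Storing only the changed fields keeps entries small and makes them easy to show to users or operations staff.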

Implement a History Log Helper

IHistoryLogger

HistoryLogger

HistoryLogger.Current.Write();

History Logs

DocumentDB

Users

Orders

Claims

Wrap History in the DAL

History Logs

OrdersDal

UsersDal

ContentDal

Relational DB

Users

Orders

Claims

Content

What happened with my order?

History Logs

OrdersDal

UsersDal

ContentDal

Relational DB

Users

Orders

Claims

Content
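Wrapping history in the DAL means every write path records its own change, so "What happened with my order?" becomes a simple lookup. The Python sketch below uses in-memory dictionaries in place of the relational database and the history store; `OrdersDal.save` and `history_for` are hypothetical names.

```python
from typing import Any, Dict, List


class HistoryLogger:
    """In-memory stand-in for the history store."""

    def __init__(self) -> None:
        self.entries: List[Dict[str, Any]] = []

    def write(self, entry: Dict[str, Any]) -> None:
        self.entries.append(entry)


class OrdersDal:
    def __init__(self, history: HistoryLogger, application: str) -> None:
        self._rows: Dict[int, Dict[str, Any]] = {}   # stands in for the Orders table
        self._history = history
        self._application = application

    def save(self, order_id: int, values: Dict[str, Any], changed_by: str) -> None:
        before = self._rows.get(order_id, {})
        after = {**before, **values}
        changed = {
            name: {"old": before.get(name), "new": new}
            for name, new in after.items()
            if before.get(name) != new
        }
        self._rows[order_id] = after
        if changed:
            # The DAL writes history itself, so callers cannot forget to.
            self._history.write({
                "table": "Orders",
                "key": order_id,
                "changed_by": changed_by,
                "application": self._application,
                "changes": changed,
            })

    def history_for(self, order_id: int) -> List[Dict[str, Any]]:
        return [
            entry for entry in self._history.entries
            if entry["table"] == "Orders" and entry["key"] == order_id
        ]
```

`dal.history_for(42)` then returns the ordered trail of changes for that order, ready to surface in a support or operations view.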

Activity Logs

• Not specific to code execution and troubleshooting, diagnostics

• Specific to the application, user activity

• COULD be informative to users as well

– History of recent activity in the site

– Reports they requested, downloads, other…

• Provides insights to the business regarding user activity, trends and patterns

– Non-critical analysis

Implement an Activity Log Helper

IActivityLogger

ActivityLogger

ActivityLogger.Current.UserDownload();

ActivityLogger.Current.ReportRequest();

ActivityLogger.Current.PurchaseOrder();
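Unlike the generic trace methods, the activity helper exposes one method per business action. A minimal Python sketch, assuming a hypothetical sink callable and made-up parameters for each method (the slide shows the calls without arguments):

```python
from datetime import datetime, timezone
from typing import Any, Callable, Dict


class ActivityLogger:
    def __init__(self, sink: Callable[[Dict[str, Any]], None]) -> None:
        self._sink = sink

    def _write(self, activity: str, user_id: str, **details: Any) -> None:
        self._sink({
            "activity": activity,
            "user_id": user_id,
            "at": datetime.now(timezone.utc).isoformat(),
            "details": details,
        })

    # One method per business action keeps call sites readable
    # and the stored shape consistent per activity type.
    def user_download(self, user_id: str, file_name: str) -> None:
        self._write("UserDownload", user_id, file_name=file_name)

    def report_request(self, user_id: str, report_name: str) -> None:
        self._write("ReportRequest", user_id, report_name=report_name)

    def purchase_order(self, user_id: str, order_id: int) -> None:
        self._write("PurchaseOrder", user_id, order_id=order_id)
```

Because each entry carries the user ID and activity name, the same store can feed both a "recent activity" page and business reporting.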

Activity Logs

DocumentDB

What happened with my order?

History Logs

OrdersDal

Relational DB

Orders

Activity Logs

Automate Logging Where Possible

• View controllers

• API controllers

• Authorization hooks

• Outbound calls

• Data Access layers
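Automating logging at these boundaries usually means wrapping the call rather than editing each method. In Python that can be sketched with a decorator; `logged` is a hypothetical name, and in a web framework the same idea would typically live in a filter or middleware.

```python
import functools
import logging
import time
from typing import Any, Callable


def logged(logger: logging.Logger) -> Callable:
    """Wrap a function so success, duration and failures are logged automatically."""

    def decorate(func: Callable) -> Callable:
        @functools.wraps(func)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            start = time.perf_counter()
            try:
                result = func(*args, **kwargs)
            except Exception:
                # logger.exception records the message at ERROR level with the traceback.
                logger.exception("%s failed", func.__qualname__)
                raise
            elapsed_ms = (time.perf_counter() - start) * 1000
            logger.info("%s succeeded in %.1f ms", func.__qualname__, elapsed_ms)
            return result

        return wrapper

    return decorate
```

Applying `@logged(logging.getLogger("dal"))` to a data access method gives every call an entry without any logging code in the method body.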

To Queue or NOT to Queue

Event Logs Audit Logs Activity Logs History Logs

Loggers

Client and Server Logging

Web Browsers

Mobile Apps

Client Apps

Mobile API

Client API

Log API

Client API

Log API

What can I queue?

Event Logs Audit Logs Activity Logs History Logs

Loggers

ETW

DocDB

ETW Goal

Event Logs Audit Logs Activity Logs History Logs

Loggers

ETW

History Publisher

Activity Publisher

Audit Publisher

Events Publisher

Stream Analytics

ALERTS

Queued Logging

• Considerations

– Timestamps matter

– Correlation across nodes matters (to a point)

– Guaranteed exactly-once, in-order delivery doesn’t exist

– Async is good (mostly)

• That said

– Priority matters (hot, warm, default)

– Simplicity matters

– Throughput matters
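These considerations can be combined in a small in-process sketch: stamp the entry when it is enqueued (not when it is written), carry a correlation ID, and let a background worker drain hot entries first. The Python below uses the standard `queue.PriorityQueue`; the class name, the three priority constants and the sink signature are assumptions, and a production system would more likely use an external queue or ETW as on the earlier slides.

```python
import itertools
import queue
import threading
import time
from typing import Any, Callable, Dict, Optional

HOT, WARM, DEFAULT = 0, 1, 2  # lower number drains first


class QueuedLogger:
    def __init__(self, sink: Callable[[int, Dict[str, Any]], None]) -> None:
        self._sink = sink
        self._queue: queue.PriorityQueue = queue.PriorityQueue()
        self._sequence = itertools.count()  # tie-breaker keeps FIFO order per priority
        self._worker = threading.Thread(target=self._drain, daemon=True)

    def start(self) -> None:
        self._worker.start()

    def write(self, message: str, priority: int = DEFAULT,
              correlation_id: Optional[str] = None) -> None:
        entry = {
            "timestamp": time.time(),          # captured at enqueue time
            "correlation_id": correlation_id,
            "message": message,
        }
        # Returns immediately; the caller never waits on storage.
        self._queue.put((priority, next(self._sequence), entry))

    def _drain(self) -> None:
        while True:
            priority, _, entry = self._queue.get()
            try:
                self._sink(priority, entry)
            finally:
                self._queue.task_done()

    def flush(self) -> None:
        # Block until everything queued so far has reached the sink.
        self._queue.join()
```

Ordering here holds only within one process; across nodes, the timestamp and correlation ID are what let you reassemble a sequence afterwards.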

Troubleshooting Is Important!

Problem Statement

• We need immediate access to what the HECK is going on when there is a problem

• Sometimes I use (in order):

– Google Analytics

– Event Logs (Azure Website)

– Table Storage queries (STRIKE THAT, USELESS)

– Blob storage CSVs (good enough, not realtime)

Elasticsearch Architecture

Elasticsearch

Logger AuditLogger HistoryLogger ActivityLogger

Kibana Visualization

Logstash

Identity Server

Web Server / IIS / Event Logs

CPU / Memory Perf Counters

Blob CSVs …

Archives, Aggregation and Analytics

ARCHIVE

Elasticsearch

Audit Logs

Activity Logs

History Logs

HDInsight

PowerShell

Spin up, analyze, spin down

Ingest

Blob Storage

Event Logs

OR, just…

What you’re looking for is…

• Manageable implementation

• Ability to “evolve” log content

• Reduce IO / socket overhead (monitor this)

• Prioritization

• Real-time analytics, troubleshooting

• Accessibility for UI lookups (history, activity)

• Archival and mass analysis

References

• Conference resources:

– http://michelebusta.com

• Contact me:

– michelebusta@solliance.net

– @michelebusta

• Founder, CIO of Solliance

– http://solliance.net
