New Relic in action at Trainline
Some vitals
• ~40 environments
• Over 1,000 servers
• Over 100 products
• Windows/.NET
• New Relic .NET agent / Server Monitor
• Automation is key!
Before New Relic
• Application errors logged to disk
• Production support team looked at logs
  – After a production issue was identified from customer reports
  – After a platform release, to check for changes in patterns
• Ad hoc and reactive
• Errors difficult to reproduce, as investigation usually happened hours or days after the event and out of context
Introducing New Relic at Trainline
• Zero capital outlay, subscription model, up and running in an hour
• Identified a product: leisure website
• Continuous delivery pipeline with blue/green deployments to all environments
• Needed a solution for continuous monitoring
Introducing New Relic at Trainline
• New Relic agent / server monitor part of application server image
• Deployed with high security enabled
• Out of the box
  – Near-real-time error logging / alerting
  – Application / end-user performance
  – Deployment markers
  – User funnels
Immediate value
• Error rate as a team key performance indicator
• Drive down error rate through weekly health checks
• Remediate top three errors by adding directly to dev team backlog
• Stack traces visible and actionable by developers without further analysis
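The weekly health check described above can be sketched in a few lines. This is a minimal illustration, not Trainline's actual tooling: the record shape, the `error_class` field, and the sample numbers are all hypothetical, and in practice the data would come from New Relic rather than a hard-coded list.

```python
from collections import Counter


def error_rate(total_requests, error_count):
    """Return errors as a percentage of all requests (0.0 if no traffic)."""
    if total_requests == 0:
        return 0.0
    return 100.0 * error_count / total_requests


def top_errors(errors, n=3):
    """Return the n most frequent error classes with their counts."""
    return Counter(e["error_class"] for e in errors).most_common(n)


# Hypothetical sample data: one week of errors for a single application.
weekly_errors = [
    {"error_class": "System.NullReferenceException"},
    {"error_class": "System.TimeoutException"},
    {"error_class": "System.NullReferenceException"},
    {"error_class": "System.Net.WebException"},
    {"error_class": "System.TimeoutException"},
    {"error_class": "System.NullReferenceException"},
    {"error_class": "System.FormatException"},
]
weekly_requests = 1400  # hypothetical request count for the same week

rate = error_rate(weekly_requests, len(weekly_errors))
backlog_candidates = top_errors(weekly_errors)

print(f"Error rate: {rate:.2f}%")
for error_class, count in backlog_candidates:
    print(f"{count:>3}  {error_class}")
```

The three entries in `backlog_candidates` are the ones a team would add to its backlog that week; tracking `rate` week over week gives the KPI trend.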
Taking it further
• Roll out New Relic across all machines in all environments
  – New Relic installed on base images for new machines
  – Otherwise use SCCM to manage installation
Application/server monitoring built in and zero effort for dev teams
Taking it further
✓ Custom attributes
• Mimic high security mode in newrelic.config
  – Create and deploy Chocolatey package through Chef / SCCM
• Observations:
  – New Relic .NET agent doesn’t check in to verify that the highSecurity setting matches once it has started
<highSecurity enabled="true"/>
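Since the agent reportedly does not re-verify the setting after start-up, one option is to audit the deployed newrelic.config as part of the packaging or deployment step. The sketch below is an assumption about how such a check might look, not something the talk describes: the function names are hypothetical, the sample file is cut down to the one relevant element, and a missing element is treated as "not enabled".

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip any '{namespace}' prefix that ElementTree adds to a tag."""
    return tag.rsplit("}", 1)[-1]


def high_security_enabled(config_xml):
    """Return True if the config contains <highSecurity enabled="true"/>."""
    root = ET.fromstring(config_xml)
    for element in root.iter():
        if local_name(element.tag) == "highSecurity":
            return element.get("enabled", "false").strip().lower() == "true"
    return False  # element absent: treated here as not enabled


# Minimal stand-in for a real newrelic.config (assumed layout).
SAMPLE = """<?xml version="1.0"?>
<configuration xmlns="urn:newrelic-config">
  <highSecurity enabled="true"/>
</configuration>"""

print("high security enabled:", high_security_enabled(SAMPLE))
```

Matching on the local tag name keeps the check independent of whatever XML namespace the real file declares.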
More value…
• Use custom attributes to augment Transaction and PageView events with more information to form other business metrics.
• Phoenix’s real-time payments dashboard
  – Spread of payment methods
  – Effect of payment outages
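A "spread of payment methods" view could be driven by a faceted NRQL query over Transaction events. The sketch below only builds the HTTP request and does not send it. The attribute name `paymentMethod`, the account ID, and the query key are hypothetical placeholders, and the endpoint and `X-Query-Key` header reflect the Insights query API as commonly documented, so check them against current New Relic documentation before use.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumed Insights query API host; verify against New Relic's documentation.
INSIGHTS_HOST = "https://insights-api.newrelic.com"


def payment_spread_request(account_id, query_key, since="1 hour ago"):
    """Build (but do not send) a request counting transactions per payment method."""
    nrql = (
        "SELECT count(*) FROM Transaction "
        "FACET paymentMethod "  # hypothetical custom attribute name
        f"SINCE {since}"
    )
    url = f"{INSIGHTS_HOST}/v1/accounts/{account_id}/query?" + urlencode({"nrql": nrql})
    headers = {"Accept": "application/json", "X-Query-Key": query_key}
    return Request(url, headers=headers)


# Placeholder credentials for illustration only.
request = payment_spread_request(account_id=12345, query_key="YOUR_QUERY_KEY")
print(request.full_url)
```

A sudden drop in one facet's count relative to the others is the kind of signal that would make a payment outage visible on such a dashboard.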
How Trainline uses New Relic
• Monitoring/Production Support use it for a near-real-time view of the running health of the system
• Product owners home in and use funnels to prioritise product effort and spend
• Developers get rapid feedback on new features
• Management get a holistic view of the system through the map feature
What we’d like to see / what we’ll be seeing soon
✓ JavaScript errors in Insights
✓ Node.js application errors in Insights
✓ Better JavaScript stack traces
• Per-application retention period in Insights
• Full .NET async support