![Page 1: From Server to Service: How Microsoft moved Team Foundation Server to Windows Azure Grant Holliday Senior Premier Field Engineer AZR323b](https://reader030.vdocuments.us/reader030/viewer/2022032722/56649cee5503460f949bb756/html5/thumbnails/1.jpg)
From Server to Service: How Microsoft moved Team Foundation Server to Windows Azure
Grant Holliday
Senior Premier Field Engineer
AZR323b
![Page 2: From Server to Service: How Microsoft moved Team Foundation Server to Windows Azure Grant Holliday Senior Premier Field Engineer AZR323b](https://reader030.vdocuments.us/reader030/viewer/2022032722/56649cee5503460f949bb756/html5/thumbnails/2.jpg)
Introduction
TFSPreview.com
3 Years in Redmond
![Page 3: From Server to Service: How Microsoft moved Team Foundation Server to Windows Azure Grant Holliday Senior Premier Field Engineer AZR323b](https://reader030.vdocuments.us/reader030/viewer/2022032722/56649cee5503460f949bb756/html5/thumbnails/3.jpg)
What does a Premier Field Engineer do?
![Page 4: From Server to Service: How Microsoft moved Team Foundation Server to Windows Azure Grant Holliday Senior Premier Field Engineer AZR323b](https://reader030.vdocuments.us/reader030/viewer/2022032722/56649cee5503460f949bb756/html5/thumbnails/4.jpg)
Microsoft Services Premier Support
- Help customers manage and support their IT
- Technical account management and proactive advisory services
- Fast issue resolution
- Through packaged and customized offerings

![Page 5: From Server to Service: How Microsoft moved Team Foundation Server to Windows Azure Grant Holliday Senior Premier Field Engineer AZR323b](https://reader030.vdocuments.us/reader030/viewer/2022032722/56649cee5503460f949bb756/html5/thumbnails/5.jpg)
Demo
A quick introduction to http://www.tfspreview.com/
Team Foundation Service
![Page 6: From Server to Service: How Microsoft moved Team Foundation Server to Windows Azure Grant Holliday Senior Premier Field Engineer AZR323b](https://reader030.vdocuments.us/reader030/viewer/2022032722/56649cee5503460f949bb756/html5/thumbnails/6.jpg)
Standing on the Shoulders of Giants
![Page 7: From Server to Service: How Microsoft moved Team Foundation Server to Windows Azure Grant Holliday Senior Premier Field Engineer AZR323b](https://reader030.vdocuments.us/reader030/viewer/2022032722/56649cee5503460f949bb756/html5/thumbnails/7.jpg)
20+ Years of Experience in Internet-Scale Services
James Hamilton
![Page 8: From Server to Service: How Microsoft moved Team Foundation Server to Windows Azure Grant Holliday Senior Premier Field Engineer AZR323b](https://reader030.vdocuments.us/reader030/viewer/2022032722/56649cee5503460f949bb756/html5/thumbnails/8.jpg)
The goal should be that a highly-reliable, 24x7 service should be maintained by a small 8x5 operations staff.
Engineer the problems. Don’t scale the operations team.
![Page 9: From Server to Service: How Microsoft moved Team Foundation Server to Windows Azure Grant Holliday Senior Premier Field Engineer AZR323b](https://reader030.vdocuments.us/reader030/viewer/2022032722/56649cee5503460f949bb756/html5/thumbnails/9.jpg)
Low-cost administration correlates highly with how closely the development, test, and operations teams work together
![Page 10: From Server to Service: How Microsoft moved Team Foundation Server to Windows Azure Grant Holliday Senior Premier Field Engineer AZR323b](https://reader030.vdocuments.us/reader030/viewer/2022032722/56649cee5503460f949bb756/html5/thumbnails/10.jpg)
The product team is held accountable for the success of the service. This drives the right behaviours.
![Page 11: From Server to Service: How Microsoft moved Team Foundation Server to Windows Azure Grant Holliday Senior Premier Field Engineer AZR323b](https://reader030.vdocuments.us/reader030/viewer/2022032722/56649cee5503460f949bb756/html5/thumbnails/11.jpg)
Team Structure
Brian Harry
Version Control
Work Item Tracking .. Agile Tools
Service Delivery
Team
* Missing a couple of management/organisational layers, but the point is that everybody is on the same team
![Page 12: From Server to Service: How Microsoft moved Team Foundation Server to Windows Azure Grant Holliday Senior Premier Field Engineer AZR323b](https://reader030.vdocuments.us/reader030/viewer/2022032722/56649cee5503460f949bb756/html5/thumbnails/12.jpg)
Evolution of a Service
- 2005: Source Control, Work Items, Builds, Reporting, SharePoint
- 2008: Performance fixes, Web Access
- 2010: Architectural changes, Scale out, Farms, Flexibility
- 2012: Web Access v2, Internet identities, File Service, Retries, Enhanced Tracing, Partitioning

![Page 13: From Server to Service: How Microsoft moved Team Foundation Server to Windows Azure Grant Holliday Senior Premier Field Engineer AZR323b](https://reader030.vdocuments.us/reader030/viewer/2022032722/56649cee5503460f949bb756/html5/thumbnails/13.jpg)
Service Architecture
Clients
Windows Azure Cloud Service
Windows Azure SQL Databases
Windows Azure Active Directory
Load Balancer
Worker Role
Windows Azure Storage Blobs
Web Role
![Page 14: From Server to Service: How Microsoft moved Team Foundation Server to Windows Azure Grant Holliday Senior Premier Field Engineer AZR323b](https://reader030.vdocuments.us/reader030/viewer/2022032722/56649cee5503460f949bb756/html5/thumbnails/14.jpg)
Necessary Additions
- Internet Identities: Windows Identity Foundation, Windows Azure Active Directory (ACS)
- File Service: Move blobs from expensive SQL storage to cheap Windows Azure Storage Blobs
- Fault Tolerance: Server-side retry logic to handle transient failures
- Enhanced Tracing: Manipulate tracing at runtime with very fine-grained control
- Partitioning: Co-locate logical customer databases in single physical Windows Azure SQL Databases

![Page 15: From Server to Service: How Microsoft moved Team Foundation Server to Windows Azure Grant Holliday Senior Premier Field Engineer AZR323b](https://reader030.vdocuments.us/reader030/viewer/2022032722/56649cee5503460f949bb756/html5/thumbnails/15.jpg)
Things to think about
Building a Service
![Page 16: From Server to Service: How Microsoft moved Team Foundation Server to Windows Azure Grant Holliday Senior Premier Field Engineer AZR323b](https://reader030.vdocuments.us/reader030/viewer/2022032722/56649cee5503460f949bb756/html5/thumbnails/16.jpg)
Expect Failures
On-Premises Assumptions:
- Network is solid
- SQL is available
- Dedicated servers

Cloud:
- Shared infrastructure
- Transient failures
- Flexible to cope with variations in usage and load

There’s no place like Production

![Page 17: From Server to Service: How Microsoft moved Team Foundation Server to Windows Azure Grant Holliday Senior Premier Field Engineer AZR323b](https://reader030.vdocuments.us/reader030/viewer/2022032722/56649cee5503460f949bb756/html5/thumbnails/17.jpg)
Windows Azure SQL Database Errors

| Error Number | Error Message | Cause |
| --- | --- | --- |
| 40197 | The service has encountered an error processing your request. Please try again. | In case of a hardware failure, SQL Database provides automatic failover to optimize availability for your application. Some failover actions may result in an abrupt termination of a session. |
| 40501 | The service is currently busy. Retry the request after 10 seconds. | When soft throttling limit for worker threads on a machine is exceeded, the database with the highest requests per second is throttled. |
| 40552 | The session has been terminated because of excessive transaction log space usage. Try modifying fewer rows in a single transaction. | Uncommitted transactions can block the truncation of log files. |

![Page 18: From Server to Service: How Microsoft moved Team Foundation Server to Windows Azure Grant Holliday Senior Premier Field Engineer AZR323b](https://reader030.vdocuments.us/reader030/viewer/2022032722/56649cee5503460f949bb756/html5/thumbnails/18.jpg)
Transient Fault Handling Application Block

    using (SqlConnection conn = new SqlConnection(connString))
    {
        // Attempt to open a connection using the
        // specified retry policy.
        conn.OpenWithRetry(retryPolicy);
        // ... execute SQL queries
    }

![Page 19: From Server to Service: How Microsoft moved Team Foundation Server to Windows Azure Grant Holliday Senior Premier Field Engineer AZR323b](https://reader030.vdocuments.us/reader030/viewer/2022032722/56649cee5503460f949bb756/html5/thumbnails/19.jpg)
Transient Fault Handling Application Block

    using (IDataReader dataReader =
        selectCommand.ExecuteReaderWithRetry(retryPolicy))
    {
        if (dataReader.Read())
        {
            // ... etc

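
Both snippets above rely on a retry policy. As a rough illustration of what such a policy does, here is a minimal sketch in Python (used for brevity; the deck's own snippets are C#). `SqlError`, `run_with_retry`, and the decision to treat only errors 40197 and 40501 as retryable are assumptions of this sketch, not the Transient Fault Handling Application Block's API.

```python
import random
import time

# Error numbers taken from the error table in this deck. Treating exactly
# these two as retryable is an assumption of the sketch, not a complete list.
TRANSIENT_ERRORS = {40197, 40501}


class SqlError(Exception):
    """Hypothetical stand-in for a database driver error carrying an error number."""

    def __init__(self, number, message=""):
        super().__init__(message or f"SQL error {number}")
        self.number = number


def run_with_retry(operation, max_attempts=4, base_delay=0.5, sleep=time.sleep):
    """Call operation(); retry transient failures with exponential backoff and jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except SqlError as err:
            # Non-transient errors, and the final attempt, propagate to the caller.
            if err.number not in TRANSIENT_ERRORS or attempt == max_attempts:
                raise
            delay = base_delay * 2 ** (attempt - 1)
            sleep(delay + random.uniform(0, delay / 2))
```

Error 40552 is deliberately left out of the retryable set: repeating an oversized transaction unchanged is unlikely to succeed.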
![Page 20: From Server to Service: How Microsoft moved Team Foundation Server to Windows Azure Grant Holliday Senior Premier Field Engineer AZR323b](https://reader030.vdocuments.us/reader030/viewer/2022032722/56649cee5503460f949bb756/html5/thumbnails/20.jpg)
Availability & SLAs
Assume your service depends on these four services:
- Storage – 99.9%
- Network – 99.95%
- Compute – 99.9%
- Access Control – 99.9%

What is the maximum uptime your service can guarantee?
…without building extra redundancy in
99.9% * 99.95% * 99.9% * 99.9% ≈ 99.65% (about 35 minutes of downtime per week)
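
The multiplication above can be checked with a few lines of Python. This is a sketch that assumes the four dependencies fail independently and that the service needs all of them at once; the variable names are illustrative.

```python
from math import prod

# SLA figures from the slide, expressed as fractions.
slas = {
    "Storage": 0.999,
    "Network": 0.9995,
    "Compute": 0.999,
    "Access Control": 0.999,
}

# Serial dependencies: the service is up only when every dependency is up,
# so (assuming independent failures) the availabilities multiply.
composite = prod(slas.values())

MINUTES_PER_WEEK = 7 * 24 * 60
downtime_per_week = (1 - composite) * MINUTES_PER_WEEK

print(f"composite availability: {composite:.4%}")
print(f"allowed downtime: {downtime_per_week:.1f} min/week")
```

Adding a fifth dependency at 99.9% to `slas` shows how quickly the composite figure erodes as the dependency chain grows.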
![Page 21: From Server to Service: How Microsoft moved Team Foundation Server to Windows Azure Grant Holliday Senior Premier Field Engineer AZR323b](https://reader030.vdocuments.us/reader030/viewer/2022032722/56649cee5503460f949bb756/html5/thumbnails/21.jpg)
How is Availability Defined?

| Service | Qualifications of Downtime |
| --- | --- |
| Cloud Services (compute) | “Role Instance Downtime” is the total accumulated minutes for all role instances during a billing month that had been deployed and started by action initiated by Customer which had not been running for longer than two minutes without detection and corrective action being initiated. |
| Storage | We guarantee that at least 99.9% of the time we will successfully process correctly formatted requests that we receive to add, update, read and delete data. “Error Rate” is the total number of Failed Storage Transactions divided by the Total Storage Transactions during a set time interval (currently set at one hour). |
| SQL Database | SQL Database will maintain a “Monthly Availability” of 99.9% during a billing month. A 5-minute interval is marked as unavailable if all the customer’s attempts to establish a connection to SQL Azure fail or take longer than 30 seconds to succeed, or if all basic valid read and write operations (as described in our technical documentation) fail after connection is established. |
| Exchange Online | Any period of time when end users are unable to send or receive email with Outlook Web Access. |

![Page 22: From Server to Service: How Microsoft moved Team Foundation Server to Windows Azure Grant Holliday Senior Premier Field Engineer AZR323b](https://reader030.vdocuments.us/reader030/viewer/2022032722/56649cee5503460f949bb756/html5/thumbnails/22.jpg)
Monitoring the Database
On-Premises Assumption:
- You have access to the OS
- You can collect performance counters
- You can install SCOM Agents

Cloud:
- No access to the underlying infrastructure
- No access to performance counters, because it’s a shared server

![Page 23: From Server to Service: How Microsoft moved Team Foundation Server to Windows Azure Grant Holliday Senior Premier Field Engineer AZR323b](https://reader030.vdocuments.us/reader030/viewer/2022032722/56649cee5503460f949bb756/html5/thumbnails/23.jpg)
How to Monitor the Database
- Periodically poll the DMVs
- Management Pack for SQL Azure
- Build counters into your application:
  - Average SQL Connect Time
  - Current SQL Connection Failures/Sec
  - Current SQL Connection Retries/Sec
  - Current SQL Execution Retries/Sec
  - Current SQL Executions/Sec
  - Current SQL Notification Queries/Sec

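
Counters like the ones listed above can be kept inside the application itself. The following Python sketch shows one possible shape: a sliding-window rate counter plus a running average for connect time. `RateCounter` and `SqlCounters` are hypothetical names, and the 60-second window is an arbitrary choice for illustration.

```python
import time
from collections import deque


class RateCounter:
    """Counts events and reports a per-second rate over a sliding window."""

    def __init__(self, window_seconds=60.0, clock=time.monotonic):
        self._window = window_seconds
        self._clock = clock
        self._events = deque()

    def increment(self):
        self._events.append(self._clock())

    def per_second(self):
        cutoff = self._clock() - self._window
        # Drop events that have fallen out of the window.
        while self._events and self._events[0] < cutoff:
            self._events.popleft()
        return len(self._events) / self._window


class SqlCounters:
    """Hypothetical holder for a few of the counters named on the slide."""

    def __init__(self, clock=time.monotonic):
        self.connection_failures = RateCounter(clock=clock)
        self.connection_retries = RateCounter(clock=clock)
        self.executions = RateCounter(clock=clock)
        self._connect_time_total = 0.0
        self._connect_count = 0

    def record_connect(self, seconds):
        self._connect_time_total += seconds
        self._connect_count += 1

    def average_connect_time(self):
        if self._connect_count == 0:
            return 0.0
        return self._connect_time_total / self._connect_count
```

The data-access layer would call `increment()` and `record_connect()` at the relevant points, and a monitoring agent would read the values on a schedule.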
![Page 24: From Server to Service: How Microsoft moved Team Foundation Server to Windows Azure Grant Holliday Senior Premier Field Engineer AZR323b](https://reader030.vdocuments.us/reader030/viewer/2022032722/56649cee5503460f949bb756/html5/thumbnails/24.jpg)
Monitoring the Application Tier
On-Premises Assumption:
- Call a TFS ‘Server Status’ web service on server
- Performance counters

Cloud:
- Lots of Application Tiers
- Not directly accessible to the Internet
- Can’t sync status across servers (doesn’t scale)

![Page 25: From Server to Service: How Microsoft moved Team Foundation Server to Windows Azure Grant Holliday Senior Premier Field Engineer AZR323b](https://reader030.vdocuments.us/reader030/viewer/2022032722/56649cee5503460f949bb756/html5/thumbnails/25.jpg)
How to Monitor the Application Tier
Build events into your application:
- “A request for service host XX has been executing for 34 seconds, exceeding the warning threshold of 30.”

Windows Azure Diagnostics:
- Built in to Azure
- Periodically collects perf counters, event logs, crash dumps
- Uploads them to Table/Blob storage

System Center Monitoring Pack for Windows Azure Apps
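
An event like the long-running request warning quoted on this slide could be produced by tracking in-flight requests and checking them periodically. The Python sketch below is one way to do that; `ActiveRequests` and its methods are hypothetical, and only the message wording and the 30-second threshold come from the slide.

```python
import logging
import time

logger = logging.getLogger("service.requests")

WARNING_THRESHOLD_SECONDS = 30  # threshold quoted on the slide


class ActiveRequests:
    """Tracks in-flight requests so a periodic check can flag slow ones."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._started = {}  # request id -> (service host, start time)

    def begin(self, request_id, service_host):
        self._started[request_id] = (service_host, self._clock())

    def end(self, request_id):
        self._started.pop(request_id, None)

    def check(self):
        """Log a warning for each request over the threshold; return the messages."""
        messages = []
        now = self._clock()
        for service_host, started in self._started.values():
            elapsed = int(now - started)
            if elapsed > WARNING_THRESHOLD_SECONDS:
                message = (
                    f"A request for service host {service_host} has been executing "
                    f"for {elapsed} seconds, exceeding the warning threshold of "
                    f"{WARNING_THRESHOLD_SECONDS}."
                )
                logger.warning(message)
                messages.append(message)
        return messages
```

A background timer would call `check()` every few seconds; a log collector can then ship the warnings to wherever operations staff look.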
![Page 26: From Server to Service: How Microsoft moved Team Foundation Server to Windows Azure Grant Holliday Senior Premier Field Engineer AZR323b](https://reader030.vdocuments.us/reader030/viewer/2022032722/56649cee5503460f949bb756/html5/thumbnails/26.jpg)
Monitoring the End-User Experience
On-Premises Assumption:
- Wait for them to tell you
- SCOM monitors

Cloud:
- Many more users
- Less reliable and slower networks
- Would probably give up, rather than say something is slow/broken

![Page 27: From Server to Service: How Microsoft moved Team Foundation Server to Windows Azure Grant Holliday Senior Premier Field Engineer AZR323b](https://reader030.vdocuments.us/reader030/viewer/2022032722/56649cee5503460f949bb756/html5/thumbnails/27.jpg)
Outside-In Monitoring
Synthetic transactions:
- Executed continuously
- From key points around the world
- Using typical ISP connections

System Center Global Service Monitor:
- Agents run by Microsoft
- Integrates with System Center Operations Manager

Others: Gomez, Keynote
![Page 28: From Server to Service: How Microsoft moved Team Foundation Server to Windows Azure Grant Holliday Senior Premier Field Engineer AZR323b](https://reader030.vdocuments.us/reader030/viewer/2022032722/56649cee5503460f949bb756/html5/thumbnails/28.jpg)
Testing in Production (TiP monitors)
Synthetic transactions:
- Executed continuously
- From another role in the same datacentre
- Exercise dependent services

Continuous smoke testing of the service:
- Keeps downstream providers accountable
- Information to quickly diagnose an outage

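
A synthetic transaction of this kind is a scripted sequence of steps that exercises each dependent service and records where it failed. The Python sketch below shows the general idea only; the function names and the result format are assumptions, and the real steps (sign in, read a blob, run a query) would be supplied by the service.

```python
import time


def run_synthetic_transaction(steps, clock=time.monotonic):
    """Run named steps in order; stop at the first failure and report where it broke.

    `steps` is a list of (name, callable) pairs; each callable raises on failure.
    """
    results = []
    for name, step in steps:
        started = clock()
        try:
            step()
        except Exception as exc:  # a monitor records any failure rather than crashing
            results.append(
                {"step": name, "ok": False, "seconds": clock() - started, "error": repr(exc)}
            )
            break
        results.append(
            {"step": name, "ok": True, "seconds": clock() - started, "error": None}
        )
    return results


def first_failure(results):
    """Return the name of the failing step, or None when every step passed."""
    for result in results:
        if not result["ok"]:
            return result["step"]
    return None
```

Because each step maps to one dependency, the name returned by `first_failure` points straight at the provider that needs attention.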
![Page 29: From Server to Service: How Microsoft moved Team Foundation Server to Windows Azure Grant Holliday Senior Premier Field Engineer AZR323b](https://reader030.vdocuments.us/reader030/viewer/2022032722/56649cee5503460f949bb756/html5/thumbnails/29.jpg)
Diagnosing Issues
- Easy problems: TFS Activity Log – keeps track of every command & parameter that a user runs
- Complex problems: Fine-grained tracing – controllable at runtime via database
- Really hard problems: Debugging role – a parallel Azure role deployment to which a customer can be redirected and a debugger attached

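
Tracing that is "controllable at runtime via database" can be pictured as a set of filter rows that the service re-reads periodically. The Python sketch below is a guess at the general shape, not the TFS implementation: `TraceFilter`, `Tracer`, the level constants, and the `load_filters` callable (standing in for a database query) are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional

ERROR, WARNING, INFO, VERBOSE = 1, 2, 3, 4


@dataclass(frozen=True)
class TraceFilter:
    """One row of a hypothetical trace-settings table; None matches anything."""

    account: Optional[str] = None
    area: Optional[str] = None
    max_level: int = ERROR

    def matches(self, account: str, area: str, level: int) -> bool:
        return (
            self.account in (None, account)
            and self.area in (None, area)
            and level <= self.max_level
        )


class Tracer:
    def __init__(self, load_filters: Callable[[], Iterable[TraceFilter]]):
        # load_filters stands in for a query against the settings database.
        self._load_filters = load_filters
        self._filters: List[TraceFilter] = []
        self.lines: List[str] = []

    def refresh(self) -> None:
        """Re-read the filters; call periodically so changes apply without a restart."""
        self._filters = list(self._load_filters())

    def trace(self, account: str, area: str, level: int, message: str) -> bool:
        """Record the message if any active filter matches; return whether it was kept."""
        if any(f.matches(account, area, level) for f in self._filters):
            self.lines.append(f"[{area}] {account}: {message}")
            return True
        return False
```

Inserting one row for a single account and area turns on verbose tracing for just that customer, which keeps the volume manageable on a shared service.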
![Page 30: From Server to Service: How Microsoft moved Team Foundation Server to Windows Azure Grant Holliday Senior Premier Field Engineer AZR323b](https://reader030.vdocuments.us/reader030/viewer/2022032722/56649cee5503460f949bb756/html5/thumbnails/30.jpg)
Fine-Grained Tracing
![Page 31: From Server to Service: How Microsoft moved Team Foundation Server to Windows Azure Grant Holliday Senior Premier Field Engineer AZR323b](https://reader030.vdocuments.us/reader030/viewer/2022032722/56649cee5503460f949bb756/html5/thumbnails/31.jpg)
Separate Debugging Role
[Diagram] DNS https://*.tfspreview.com/ → VIP 65.52.8.37 → Web Role and Worker Role (Role Instance #1…n each)
[Diagram] New DNS Record https://sadcustomer.tfspreview.com/ → VIP 65.52.X.Y → Web Role (Role Instance #1, Attach Debugger)
[Diagram] Databases: Config DB, Customer DB, Customer DB
![Page 32: From Server to Service: How Microsoft moved Team Foundation Server to Windows Azure Grant Holliday Senior Premier Field Engineer AZR323b](https://reader030.vdocuments.us/reader030/viewer/2022032722/56649cee5503460f949bb756/html5/thumbnails/32.jpg)
Upgrades / Patches / Hotfixes
Users are geographically distributed
- No ideal time for an offline upgrade

Can’t upgrade every customer at once
- Too much load
- Too much risk

How to do big “Keynote” releases?
- Feature flagging
- Turn features on/off at runtime based upon Account, IP, etc

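
Feature flagging by account or IP address can be as small as a lookup performed on each request. The Python sketch below illustrates the idea; `FeatureFlag` and its fields are hypothetical, and in a real service the flag definitions would live in configuration that can be changed at runtime.

```python
import ipaddress
from dataclasses import dataclass, field


@dataclass
class FeatureFlag:
    """A flag that can be on for everyone, for named accounts, or for IP ranges."""

    name: str
    enabled_for_all: bool = False
    accounts: set = field(default_factory=set)
    networks: list = field(default_factory=list)  # CIDR strings, e.g. "203.0.113.0/24"

    def is_enabled(self, account, client_ip):
        if self.enabled_for_all:
            return True
        if account in self.accounts:
            return True
        address = ipaddress.ip_address(client_ip)
        return any(address in ipaddress.ip_network(cidr) for cidr in self.networks)
```

With this shape, a feature can ship dark, be enabled for a demo network before a keynote, and then be switched on for everyone by setting `enabled_for_all`.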
![Page 33: From Server to Service: How Microsoft moved Team Foundation Server to Windows Azure Grant Holliday Senior Premier Field Engineer AZR323b](https://reader030.vdocuments.us/reader030/viewer/2022032722/56649cee5503460f949bb756/html5/thumbnails/33.jpg)
How to Think About Upgrades
Upgrade must be an online operation
- Deploy one piece at a time (Schema, Services, Web)
- “Trickle” upgrades that migrate the data to new schema

Multiple versions must coexist peacefully
- New web binaries, old DB schema
- Old clients, new server

Regular, fixed deployment windows
- Every three weeks is a deployment opportunity
- If you miss one, not too long to wait for the next one
- Avoids building debt and risk

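
The "trickle" idea and the coexistence requirement fit together: readers understand both schema versions while a background job upgrades a few rows at a time. The Python sketch below uses an invented example (splitting a `name` field into `first_name` and `last_name`) purely for illustration; none of the field or function names come from TFS.

```python
def read_display_name(row):
    """Reader that understands both schema versions, so old and new rows coexist."""
    if row["schema_version"] == 1:
        return row["name"]
    return f'{row["first_name"]} {row["last_name"]}'.strip()


def upgrade_row(row):
    """Convert one version-1 row to version 2 in place."""
    first, _, last = row.pop("name").partition(" ")
    row.update(first_name=first, last_name=last, schema_version=2)


def migrate_batch(rows, batch_size=100):
    """Upgrade at most batch_size old rows; return how many were upgraded."""
    upgraded = 0
    for row in rows:
        if upgraded == batch_size:
            break
        if row["schema_version"] == 1:
            upgrade_row(row)
            upgraded += 1
    return upgraded
```

Calling `migrate_batch` repeatedly from a scheduled job spreads the load over time, and the service stays online because `read_display_name` returns the same answer before and after each row is converted.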
![Page 34: From Server to Service: How Microsoft moved Team Foundation Server to Windows Azure Grant Holliday Senior Premier Field Engineer AZR323b](https://reader030.vdocuments.us/reader030/viewer/2022032722/56649cee5503460f949bb756/html5/thumbnails/34.jpg)
Communication
Maintain (and build) trust during an outage:
- Immediately: “Yes, there’s a problem. We’re working on it”
- Regularly: “Still working on it, going to do <x>”
- After: “Root cause was <y>. It’s not going to happen again, because we’ve done <z>”

![Page 35: From Server to Service: How Microsoft moved Team Foundation Server to Windows Azure Grant Holliday Senior Premier Field Engineer AZR323b](https://reader030.vdocuments.us/reader030/viewer/2022032722/56649cee5503460f949bb756/html5/thumbnails/35.jpg)
Summary
- Team Structure Matters! Dev, Test & Ops together
- Expect Failures: handle all failures gracefully
- Most Problems Have Been Solved: your job is to find and bring those solutions together

![Page 36: From Server to Service: How Microsoft moved Team Foundation Server to Windows Azure Grant Holliday Senior Premier Field Engineer AZR323b](https://reader030.vdocuments.us/reader030/viewer/2022032722/56649cee5503460f949bb756/html5/thumbnails/36.jpg)
Related Content
Planning for Failure in Cloud Applications (AZR333 - Fri 11:30)
Exploring Windows Azure Storage (AZRILL102 - Fri 11:30)
Research Paper (http://aka.ms/InternetScaleServices)
Exam 70-583: Designing and Developing Windows Azure Applications
Find Me Later at the Speaker Lounge (12:45 – 1:45)
![Page 37: From Server to Service: How Microsoft moved Team Foundation Server to Windows Azure Grant Holliday Senior Premier Field Engineer AZR323b](https://reader030.vdocuments.us/reader030/viewer/2022032722/56649cee5503460f949bb756/html5/thumbnails/37.jpg)
© 2012 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.