Azure Large Scale Deployments - Tales from the Trenches
TRANSCRIPT
CLD334a
Aaron Saikovski
Specialist Solution Architect – Microsoft Cloud Technologies
Rackspace Australia
T: @RuskyDuck72 E: [email protected]
Deploying Complex and Large Scale Azure Environments – Tales from the Trenches
Agenda
Quick Intros
Large Scale Deployments
Subscriptions
Tagging
Storage
Networking
Automation
Monitoring
Questions
About me
Subscriptions
One subscription per environment -> Dev, Test, Prod
MSA and AzureAD accounts -> subscriptions
Enterprise Agreement (EA) -> consolidated billing
Restrict access to Prod (yes, devs, we are looking at you)
TIP#1: Use named accounts (AzureAD) instead of MSA and use MFA!!!
TIP#2: Use billing alerts at the subscription level to manage spend
Subscriptions
Source: https://docs.microsoft.com/en-us/azure/azure-subscription-service-limits#subscription-limits
Key Subscription Limits
Tagging
Key:Value pairs -> name resources
Link resources -> cost centre, business unit etc.
Group common resources
Resource -> max. 15 tags
Name -> max. 512 characters
Value -> max. 256 characters
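The tag limits above can be checked before a deployment is submitted. The following is a minimal sketch, assuming tags are held as a plain Python dict; the function name `validate_tags` is hypothetical and the limits are the ones quoted on the slide, which may have changed since the talk.

```python
# Limits as quoted on the slide (assumed still current; check the Azure docs).
MAX_TAGS_PER_RESOURCE = 15
MAX_TAG_NAME_LENGTH = 512
MAX_TAG_VALUE_LENGTH = 256


def validate_tags(tags):
    """Return a list of problems; an empty list means the tags fit the limits."""
    problems = []
    if len(tags) > MAX_TAGS_PER_RESOURCE:
        problems.append(
            f"{len(tags)} tags exceeds the limit of {MAX_TAGS_PER_RESOURCE}"
        )
    for name, value in tags.items():
        if len(name) > MAX_TAG_NAME_LENGTH:
            problems.append(
                f"tag name starting '{name[:20]}' is longer than "
                f"{MAX_TAG_NAME_LENGTH} characters"
            )
        if len(value) > MAX_TAG_VALUE_LENGTH:
            problems.append(
                f"value of tag '{name[:20]}' is longer than "
                f"{MAX_TAG_VALUE_LENGTH} characters"
            )
    return problems


if __name__ == "__main__":
    # Hypothetical tag set for a Dev resource.
    print(validate_tags({"Environment": "Dev", "CostCentre": "1234"}))
```

Running a check like this in the build pipeline catches oversized tag sets before ARM rejects the deployment.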
Tagging..cont
Examples:
Environment: Dev, Test, Prod
Build date
Cost centre
Owner
Azure “Classic” mode doesn’t support tagging
TIP#3: Automated shutdown of resources without tags. Save $$$
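TIP#3 could be approached by first working out which resources are missing tags. This is a rough sketch only: it assumes the resource inventory has already been exported as a list of dicts with `name` and `tags` keys, and the required tag names and the function `resources_to_shut_down` are hypothetical. It does not call Azure or shut anything down.

```python
# Hypothetical policy: every resource must carry these tags.
REQUIRED_TAGS = ("Environment", "Owner")


def resources_to_shut_down(resources, required_tags=REQUIRED_TAGS):
    """Return the names of resources missing any required tag."""
    flagged = []
    for resource in resources:
        # Treat a missing or null "tags" entry as "no tags at all".
        tags = resource.get("tags") or {}
        if any(not tags.get(key) for key in required_tags):
            flagged.append(resource["name"])
    return flagged


if __name__ == "__main__":
    inventory = [
        {"name": "vm-web-01", "tags": {"Environment": "Prod", "Owner": "web"}},
        {"name": "vm-test-07", "tags": {"Environment": "Test"}},
        {"name": "vm-scratch", "tags": None},
    ]
    for name in resources_to_shut_down(inventory):
        print(f"candidate for shutdown: {name}")
```

The flagged list would then feed whatever scheduled job performs the actual shutdown.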
Tagging
Source: https://docs.microsoft.com/en-us/azure/azure-resource-manager/resource-group-using-tags
Quick Storage Recap
Source: https://docs.microsoft.com/en-us/azure/storage/storage-redundancy
Storage Accounts
Don’t overload storage accounts
Plan Pricing Tiers -> Performance
Premium storage -> Production workloads
Avoid single storage accounts
Standard storage -> MAX 500 IOPS per disk
Premium -> MAX 5000 IOPS per disk (P30)
TIP#4: Enable encryption when provisioning. Not after!
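The per-disk IOPS limits above turn into simple planning arithmetic. The sketch below assumes a workload's IOPS can be spread evenly across several striped data disks, which is an assumption not made on the slide, and it ignores VM-size and storage-account caps. The function name `disks_needed` is hypothetical.

```python
import math

# Per-disk limits as quoted on the slide.
STANDARD_IOPS_PER_DISK = 500
PREMIUM_P30_IOPS_PER_DISK = 5000


def disks_needed(target_iops, iops_per_disk):
    """Smallest number of disks whose combined IOPS limit covers the target."""
    if target_iops <= 0 or iops_per_disk <= 0:
        raise ValueError("IOPS values must be positive")
    return math.ceil(target_iops / iops_per_disk)


if __name__ == "__main__":
    target = 12000  # hypothetical workload requirement
    print("standard disks:", disks_needed(target, STANDARD_IOPS_PER_DISK))
    print("premium P30 disks:", disks_needed(target, PREMIUM_P30_IOPS_PER_DISK))
```

For a hypothetical 12,000 IOPS workload this gives 24 standard disks against 3 premium P30 disks, which is one reason premium storage is suggested for production.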
Storage Account Naming
Naming of storage accounts -> Storage load balancing
E.g. ‘devstorageacct001’, ‘devstorageacct002’
Traffic bound to a partition server -> Rebalance -> performance hit!
Can have a big performance hit on VM workloads
TIP#5: Prefix storage accounts with a 3 digit hash (Unique)
Source: https://docs.microsoft.com/en-us/azure/storage/storage-performance-checklist
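One way to apply TIP#5 is to derive the prefix from the account name itself, so that sequential names no longer share a leading string. This is a minimal sketch: taking the first three hex characters of an MD5 digest is an assumption chosen for illustration, not something the slide prescribes, and `prefixed_storage_account_name` is a hypothetical helper. Storage account names must be 3 to 24 lowercase letters and digits, so the helper checks that too.

```python
import hashlib
import re


def prefixed_storage_account_name(base_name):
    """Prepend a 3-character hash so similar names sort far apart."""
    if not re.fullmatch(r"[a-z0-9]+", base_name):
        raise ValueError("use lowercase letters and digits only")
    # MD5 is used here only to scatter names, not for security.
    prefix = hashlib.md5(base_name.encode("utf-8")).hexdigest()[:3]
    name = prefix + base_name
    if not 3 <= len(name) <= 24:
        raise ValueError("storage account names must be 3-24 characters")
    return name


if __name__ == "__main__":
    for base in ("devstorageacct001", "devstorageacct002"):
        print(prefixed_storage_account_name(base))
```

Because the prefix is computed from the name, the same base name always maps to the same account name, which keeps templates repeatable.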
Storage Account Naming
Same cluster
Unique cluster
Managed Disks GA Announced Feb 8th 2017!
Removes storage account scale management
Easy migration path
Massive scale set support – 1,000 VMs
2000 managed disks per subscription
RBAC roles on disks
Managed Disks -> LRS only
Late Breaking!!!
Networking
Planning!!!
Overlapping IP ranges -> ExpressRoute, S2S VPN
Deploy and Redeploy -> Iterate
Keep it simple
Single VNet vs VNet Peering
GatewaySubnet -> /27 Address Space
TIP#6: Avoid Network Security Groups (NSGs) at the NIC level
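Overlapping ranges are cheap to catch at the planning stage. The sketch below uses Python's standard `ipaddress` module to find overlaps between on-premises and VNet ranges and to check the size of a GatewaySubnet. The range names and addresses are made up for illustration, and reading the slide's "/27" as "a /27 or larger" is an assumption.

```python
import ipaddress
from itertools import combinations


def find_overlaps(named_ranges):
    """Return pairs of range names whose CIDR blocks overlap."""
    networks = {
        name: ipaddress.ip_network(cidr) for name, cidr in named_ranges.items()
    }
    return [
        (name_a, name_b)
        for (name_a, net_a), (name_b, net_b) in combinations(networks.items(), 2)
        if net_a.overlaps(net_b)
    ]


def gateway_subnet_is_large_enough(cidr, min_prefix=27):
    """True when the subnet is a /27 or larger (smaller prefix number)."""
    return ipaddress.ip_network(cidr).prefixlen <= min_prefix


if __name__ == "__main__":
    plan = {
        "on-prem": "10.0.0.0/16",    # hypothetical ranges
        "vnet-prod": "10.0.1.0/24",
        "vnet-dev": "10.1.0.0/16",
    }
    print(find_overlaps(plan))
    print(gateway_subnet_is_large_enough("10.1.255.224/27"))
```

Running this against every proposed range before an ExpressRoute or S2S VPN connection is ordered avoids a redeploy later.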
Network Security Groups (NSGs)
Recommended!!
Automation
Automate everything -> ARM, PowerShell, CLI
No manual changes
ARM is incremental
Tag resources
Resource groups & Tags for cost optimisation
Layer the deployment
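"ARM is incremental" can be pictured with a toy model. The sketch below treats a resource group as a dict of resource name to definition. It is a deliberately simplified illustration and not how ARM is implemented, and the contrast with Complete mode is added here for context rather than taken from the slide.

```python
def deploy(existing, template, mode="Incremental"):
    """Toy model of what a resource group contains after a deployment."""
    if mode == "Incremental":
        # Resources in the template are added or updated;
        # resources not mentioned in the template are left alone.
        result = dict(existing)
        result.update(template)
        return result
    if mode == "Complete":
        # Resources not in the template are removed.
        return dict(template)
    raise ValueError(f"unknown deployment mode: {mode}")


if __name__ == "__main__":
    group = {"vnet": "v1", "storage": "v1"}   # hypothetical current state
    layer = {"storage": "v2", "vm": "v1"}     # hypothetical template layer
    print(deploy(group, layer))
    print(deploy(group, layer, mode="Complete"))
```

This is why layered deployments work: each layer's template can be applied on top of the previous one without disturbing what is already there.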
Automation..cont
Store ARM templates in a private repository
Linked templates vs. layered ARM templates
Azure Automation for scheduled tasks
TIP#7: Keep your Azure PowerShell and SDK tools up to date
TIP#8: Lock ResourceGroups with ‘CanNotDelete’ lock level
TIP#9: Don’t store passwords in .param files -> use KeyVault!!
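For TIP#9, an ARM parameter file can point at a Key Vault secret instead of holding the password. The following is a minimal sketch that builds such a file in Python. The vault resource ID, secret name and parameter name are placeholders invented for illustration, and the helper names are hypothetical. The vault also has to be enabled for template deployment for the reference to resolve.

```python
import json


def key_vault_parameter(vault_resource_id, secret_name):
    """Build a parameter entry that references a Key Vault secret."""
    return {
        "reference": {
            "keyVault": {"id": vault_resource_id},
            "secretName": secret_name,
        }
    }


def build_parameter_file(parameters):
    """Wrap parameters in the standard deployment parameter file envelope."""
    return {
        "$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentParameters.json#",
        "contentVersion": "1.0.0.0",
        "parameters": parameters,
    }


if __name__ == "__main__":
    # Placeholder IDs and names: substitute your own.
    vault_id = (
        "/subscriptions/<subscription-id>/resourceGroups/<resource-group>"
        "/providers/Microsoft.KeyVault/vaults/<vault-name>"
    )
    param_file = build_parameter_file(
        {"adminPassword": key_vault_parameter(vault_id, "vmAdminPassword")}
    )
    print(json.dumps(param_file, indent=2))
```

The generated file contains no secret material, so it can sit in the same repository as the templates.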
Azure Automation
Bonus Tip: Staggered Automation runbook schedules -> PowerShell
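The idea behind staggered schedules is just offset arithmetic. The slide does this in PowerShell; the sketch below shows the same idea in Python, with hypothetical runbook names and a gap chosen for illustration. It only computes start times and does not create Azure Automation schedules.

```python
from datetime import datetime, timedelta


def staggered_start_times(runbook_names, first_start, gap_minutes=10):
    """Assign each runbook a start time, spaced gap_minutes apart."""
    return {
        name: first_start + timedelta(minutes=index * gap_minutes)
        for index, name in enumerate(runbook_names)
    }


if __name__ == "__main__":
    # Hypothetical runbooks and start time.
    schedule = staggered_start_times(
        ["Stop-DevVMs", "Stop-TestVMs", "Export-Logs"],
        first_start=datetime(2017, 2, 14, 19, 0),
        gap_minutes=15,
    )
    for name, start in schedule.items():
        print(f"{start:%H:%M}  {name}")
```

Spacing the jobs out keeps them from all competing for the same resources at the top of the hour.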
Automation..Tips and Tricks
Use "location": "[resourceGroup().location]" as default resource location
Use subscription().id, resourceGroup().id for unique identifiers in variables
Use listKeys for dynamic value lookups:
…"[listKeys(resourceId('Microsoft.Cache/Redis', parameters('redisCacheName')), '2014-04-01').primaryKey]"
Automation..Tips and Tricks..cont
Use outputs for debugging:
"outputs": {
  "RedisSessionStateHost": {
    "type": "string",
    "value": "[concat(parameters('redisCacheName'), '.redis.cache.windows.net')]"
  }
}
Monitoring
OMS (Log Analytics) -> default used by Rackspace
Support -> subscription level
Lots of metrics are captured
Automated alerting -> Support ticket
Example Key VM metrics
Malware signatures update status
Realtime protection
CPU average greater than 95 percent over 5 minutes
Operating system disk (C:) has less than 500 MB free space
Recovery vault backup failures
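Two of the thresholds above are easy to express as code. This is a minimal sketch, assuming CPU samples covering the last five minutes and the free space on the OS disk have already been collected. The function name `vm_alerts` and the alert labels are hypothetical and do not reflect how OMS evaluates its alerts.

```python
# Thresholds as listed on the slide.
CPU_THRESHOLD_PERCENT = 95.0
MIN_FREE_DISK_MB = 500


def vm_alerts(cpu_samples_percent, os_disk_free_mb):
    """Return labels for the thresholds this VM currently breaches."""
    alerts = []
    if cpu_samples_percent:
        average = sum(cpu_samples_percent) / len(cpu_samples_percent)
        if average > CPU_THRESHOLD_PERCENT:
            alerts.append("cpu")
    if os_disk_free_mb < MIN_FREE_DISK_MB:
        alerts.append("os_disk")
    return alerts


if __name__ == "__main__":
    # Hypothetical one-minute samples over a five-minute window.
    print(vm_alerts([96, 97, 99, 98, 96], os_disk_free_mb=2048))
    print(vm_alerts([99, 50, 99, 50, 99], os_disk_free_mb=400))
```

Averaging over the window, rather than alerting on a single sample, is what keeps short CPU spikes from raising tickets.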
Monitoring..cont
Include PaaS workloads – App Services, DocDB etc.
AppInsights -> URL monitoring -> multiple test locations
Webhooks -> Azure Functions -> OMS Ingestion
TIP#10: OMS has a 15 minute indexing interval
OMS Query Samples
ARM Deployments:
Type:AzureActivity AND (OperationName="Microsoft.Resources/deployments/write" OR OperationName="Microsoft.Resources/deployments/validate/action") | measure count() by ResourceId, ResourceGroup
Malware signatures out of date:
Type=ProtectionStatus AND (ProtectionStatusRank=250) AND (TypeofProtection="System Center Endpoint Protection")
OMS Query Samples..cont
SQL Azure: Average CPU utilization percentage greater than 80% over 10 minutes:
Type=sqlazure_CL MetricName_s=cpu_percent | measure max(Average_d) as DBCPU by DatabaseName_s interval 10minutes | where DBCPU >=80
Key Takeaways
TIP#1: Use named accounts (AzureAD) instead of MSA and use MFA!!!
TIP#2: Use billing alerts at the subscription level to manage spend
TIP#3: Automated shutdown of resources without tags. Save $$$
TIP#4: Enable encryption when provisioning. Not after!
TIP#5: Prefix storage accounts with a 3 digit hash (Unique)
TIP#6: Avoid Network Security Groups (NSGs) at the NIC level
TIP#7: Keep your Azure PowerShell and SDK tools up to date
TIP#8: Lock ResourceGroups with ‘CanNotDelete’ lock level
TIP#9: Don’t store passwords in .param files -> use KeyVault!!
TIP#10: OMS has a 15 minute indexing interval
Microsoft Ignite