SharePoint Business Continuity Management
Neil Hodgkinson SES314
Neil Hodgkinson
Pre-Microsoft
• CSC SharePoint Specialist – 5 Years
• Process Chemist (Drugs, Poisons and Explosives) – 3 Years
Microsoft (2005-)
• SharePoint PFE - 5 Years
• SharePoint Service Engineering O365 - 3 Years
• SharePoint Product Group - Current
• Office 365 CAT Team
Contact
Email – [email protected] - @nellymo
Agenda
• Definitions
• Technology Overview
• Office Web Apps & Workflow Manager
• SharePoint High Availability
• SharePoint Warm Standby
• SharePoint Cold Recovery
Takeaways
• Understand the concepts of Business Continuity and the implications for SharePoint
• Differentiate between High Availability and Disaster Recovery
• Gain a deeper understanding of the available techniques for implementing HA/DR for SharePoint
Definitions
• High Availability
• Disaster Recovery
• Stretched Farms
High Availability
“High Availability is a system design approach and associated service implementation that ensures a prearranged level of operational performance will be met during a contractual measurement period.”
Wikipedia
High Availability is about protecting the Service Level Agreements (SLAs) and the agreed Fault Domains
Service Level Agreements
Agreed levels of service, usually between vendors, suppliers and clients, or between departments within an organisation
Availability % | Downtime / Year | Downtime / Month | Downtime / Week
99% | 3.65 days | 7.20 hours | 1.68 hours
99.9% | 8.76 hours | 43.20 minutes | 10.10 minutes
99.99% | 52.56 minutes | 4.32 minutes | 1.01 minutes
99.999% | 5.26 minutes | 25.90 seconds | 6.05 seconds
99.9999% | 31.50 seconds | 2.59 seconds | 0.61 seconds
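The downtime figures above follow from simple arithmetic: the unavailable fraction (1 minus availability) multiplied by the length of the period. The sketch below reproduces that calculation. It assumes a 365-day year and a 30-day month, which is the convention that matches the monthly column; the function and constant names are illustrative only.

```python
# Seconds in each measurement period. A month is taken as 30 days,
# which is the assumption that reproduces the table's monthly column.
PERIOD_SECONDS = {
    "year": 365 * 24 * 3600,
    "month": 30 * 24 * 3600,
    "week": 7 * 24 * 3600,
}


def allowed_downtime_seconds(availability_percent: float, period: str) -> float:
    """Return the downtime budget, in seconds, for one period."""
    unavailable_fraction = 1 - availability_percent / 100
    return PERIOD_SECONDS[period] * unavailable_fraction


if __name__ == "__main__":
    for sla in (99.0, 99.9, 99.99, 99.999, 99.9999):
        year = allowed_downtime_seconds(sla, "year")
        month = allowed_downtime_seconds(sla, "month")
        week = allowed_downtime_seconds(sla, "week")
        print(f"{sla}%: {year / 3600:.2f} h/year, "
              f"{month / 60:.2f} min/month, {week:.2f} s/week")
```

The same function can be used to turn a proposed SLA into a concrete outage budget before agreeing to it.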
Fault Domains
“Fault Domain is the group of physical infrastructure pieces with a common configuration that share a single point of failure.”
What am I protecting against?
• Datacentre
• Rack
• Host Server
• Power Supply
• Network Card
• Virtual Server
• Service Instance
Building the Franken Rack
• Lack of Redundant Power
• Lack of Redundant Network Connectivity
• The Datacenter People Hate You
• What about what’s inside?
Defining Fault Domains
[Diagram: two separate Fault Domains]
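One way to reason about fault domains is to list, for each server role, the infrastructure its servers share. The sketch below is a minimal illustration of that idea, not a tool from the session: the inventory, server names, and component names (`rack`, `power`) are all hypothetical. It reports any role whose servers all depend on the same component, since a single failure there takes out the whole role.

```python
from collections import defaultdict

# Hypothetical inventory: each server lists the infrastructure it depends on.
SERVERS = [
    {"name": "WFE1", "role": "web", "rack": "R1", "power": "PDU-A"},
    {"name": "WFE2", "role": "web", "rack": "R1", "power": "PDU-B"},
    {"name": "SQL1", "role": "database", "rack": "R1", "power": "PDU-A"},
    {"name": "SQL2", "role": "database", "rack": "R2", "power": "PDU-B"},
]


def single_points_of_failure(servers, components=("rack", "power")):
    """Return (role, component, value) triples where every server of a role
    shares the same component value, so one failure takes out the role."""
    by_role = defaultdict(list)
    for server in servers:
        by_role[server["role"]].append(server)

    findings = []
    for role, members in sorted(by_role.items()):
        for component in components:
            values = {member[component] for member in members}
            if len(values) == 1:
                findings.append((role, component, values.pop()))
    return findings


if __name__ == "__main__":
    for role, component, value in single_points_of_failure(SERVERS):
        print(f"Role '{role}' depends entirely on {component} {value}")
```

In this made-up inventory both web servers sit in rack R1, so the web role is reported as sharing one fault domain even though its power feeds are separate.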
Disaster Recovery
“Disaster Recovery (DR) is the process, policies, and procedures that are related to preparing for recovery or continuation of technology infrastructure which are vital to an organization after a natural or human-induced disaster.”
Wikipedia
Defining Requirements
Recovery Point Objective (RPO)
• Acceptable amount of data loss measured in time
Recovery Time Objective (RTO)
• Duration of time within which a business process must be restored after a disaster
Example:
• RPO of 1 hour
• RTO of 3 hours
“I can lose 60 minutes worth of data, and all of my data can be inaccessible for three hours.”
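The example above can be expressed as a small check against a single incident. This is only a sketch under stated assumptions: the function name, its parameters, and the timestamps are hypothetical, and the defaults simply mirror the RPO of 1 hour and RTO of 3 hours used on the slide. The data loss window is measured from the last recoverable copy of the data to the failure, and the outage from the failure to the moment service is restored.

```python
from datetime import datetime, timedelta


def meets_objectives(last_recovery_point, failure_time, service_restored,
                     rpo=timedelta(hours=1), rto=timedelta(hours=3)):
    """Return (rpo_met, rto_met) for a single incident."""
    data_loss_window = failure_time - last_recovery_point
    outage_duration = service_restored - failure_time
    return data_loss_window <= rpo, outage_duration <= rto


if __name__ == "__main__":
    # Hypothetical incident: last recoverable copy at 09:30, failure at 10:15,
    # service back at 12:45 on the same day.
    day = datetime(2013, 9, 11)
    result = meets_objectives(
        last_recovery_point=day.replace(hour=9, minute=30),
        failure_time=day.replace(hour=10, minute=15),
        service_restored=day.replace(hour=12, minute=45),
    )
    print(result)
```

Here 45 minutes of data is lost and the outage lasts 2 hours 30 minutes, so both objectives are met.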
RPO/RTO versus Cost
[Chart: RPO/RTO plotted against Cost]
Stretched Farms
[Diagram: one SharePoint Farm spanning Datacentre A and Datacentre B, with < 1ms latency between them]
Stretched Farms
• Originally NOT supported for SP2013
• “Physical” Data Centre
• Supported as of April 2013
• “Logical” Data Centre
For a stretched farm architecture to work as a supported high-availability solution, the following prerequisites must be met:
There is a highly consistent intra-farm latency of <1ms, 99.9% of the time over a period of ten minutes. (Intra-farm latency is commonly defined as the latency between the front-end web servers and the database servers.)
The bandwidth speed must be at least 1 gigabit per second.
http://technet.microsoft.com/en-us/library/cc262485(v=office.15).aspx#hwLocServers
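The latency prerequisite can be read as a simple test over one ten-minute window of measurements: at least 99.9% of the samples must be below 1 ms. The sketch below shows that reading. How the samples are collected is outside its scope, and the function name, parameters, and sample values are assumptions made for illustration.

```python
def meets_latency_prerequisite(samples_ms, threshold_ms=1.0,
                               required_fraction=0.999):
    """Return True if at least required_fraction of the samples are below
    threshold_ms.

    samples_ms: latencies in milliseconds collected over one ten-minute window.
    """
    if not samples_ms:
        raise ValueError("no latency samples supplied")
    below = sum(1 for sample in samples_ms if sample < threshold_ms)
    return below / len(samples_ms) >= required_fraction


if __name__ == "__main__":
    # Hypothetical window: 1,998 fast samples and two slow ones.
    window = [0.4] * 1998 + [2.5, 3.1]
    print(meets_latency_prerequisite(window))
```

Running the check over successive windows, rather than once, is what shows whether the latency is consistently within the limit.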
Monitoring SQL Latency
# Enable the "SQL Latency Usage" usage definition so that latency data is collected
$prov = Get-SPUsageDefinition -Identity "SQL Latency Usage"
$prov.Enabled = $true
$prov.Update()

-- Query the collected latency data for one partition and time window
SELECT [MachineName], [DataSource], [InitialCatalog], [Count], [Percentile99], [MaxLatency]
FROM SQLLatencyData WITH (NOLOCK)
WHERE PartitionId = @PartId
  AND (LogTime > CONVERT(nvarchar(50), @StartDate + ' ' + @StartTime))
  AND (LogTime < CONVERT(nvarchar(50), @EndDate + ' ' + @EndTime))
High MaxLatency + high Percentile99 count requires investigation
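As a rough illustration of that rule, the rows returned by the latency query could be filtered so that only those with both a high Percentile99 and a high MaxLatency are reviewed. The sketch below assumes the rows have already been loaded as dictionaries keyed by column name. The machine names, values, and both limits are hypothetical, and the source does not state the units of the latency columns.

```python
# Hypothetical rows, shaped like the output of the latency query.
SAMPLE_ROWS = [
    {"MachineName": "WFE1", "Percentile99": 0.8, "MaxLatency": 3.0},
    {"MachineName": "WFE2", "Percentile99": 4.2, "MaxLatency": 250.0},
    {"MachineName": "APP1", "Percentile99": 0.9, "MaxLatency": 180.0},
]


def rows_to_investigate(rows, p99_limit, max_limit):
    """Return rows whose Percentile99 and MaxLatency both exceed the limits."""
    return [row for row in rows
            if row["Percentile99"] > p99_limit and row["MaxLatency"] > max_limit]


if __name__ == "__main__":
    # The limits are placeholders; choose values that suit your environment.
    for row in rows_to_investigate(SAMPLE_ROWS, p99_limit=1.0, max_limit=100.0):
        print(row["MachineName"])
```

A row with a high MaxLatency but a low Percentile99, such as APP1 in this made-up data, points to an isolated spike rather than a sustained problem.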
Demo : Monitoring Intra Farm Latency
How do I do that?
Technologies
• Failover Clustering
• Database Mirroring
• Log Shipping
• AlwaysOn Availability Groups
Failover Clustering
[Diagram: Failover Cluster]
• Server Hardware Redundancy
• Uses a shared disk subsystem
• Entire instance fails over as a unit
Database Mirroring
Synchronous, high-availability configuration
• Data is Mirrored as part of a Transaction
• Compressed Log Stream
• Auto Page Repair
Database Mirroring
Asynchronous, performance configuration
• Increased Performance
• Distance no longer a barrier
Database Mirroring Summary
• Cost-effective
• No Specialised Hardware is required
• Straightforward setup and administration
• Works at Database Level not Instance
• Hiccup while it fails over
• Applications Mirror Aware
Log Shipping
• Transaction log backups sent from primary to secondaries
• Applied to each secondary database
Multiple Secondaries
…
Putting it all together!!
[Diagram: two Failover Clusters]
• Failover Clustering: Local server redundancy
• Database Mirroring: Primary disaster site for databases
• Log Shipping: Additional disaster sites for databases
AlwaysOn Availability Groups
• SQL Server 2012
• “Kind-of” Clustering and Mirroring
• Sync-Commit
• Up to three
• Async-Commit
[Diagram: Failover Cluster with a Clustered Resource]
RPO/RTO Options
[Chart: technologies plotted by Recovery Point Objective against Recovery Time Objective; axis scale: Zero, Seconds, Minutes, Hours, Days, Weeks]
Technologies shown: Mirroring - Sync, AlwaysOn - Async, Failover Clustering, Backup/Restore, Mirroring - Async, Log Shipping, AlwaysOn - Sync
Office Web App Farm
• Nothing is persisted
• Configuration Settings
[Diagram: Office Web Apps farm behind NLB, SharePoint Farm behind NLB]
OWA – Disaster Recovery
Options
• Standby farm local to SharePoint primary farm
• Standby farm in remote datacentre
Process
• Disconnect SharePoint farm from OWA farm
  Remove-SPWOPIBinding -All:$true
• Attach SharePoint farm to Standby farm
  New-SPWOPIBinding -ServerName <WacServerName>
Workflow Manager Farms
• One or Three Servers
• Dependencies on the Service Bus
Disaster Recovery
• Complex!!!
• Requires creating a new Farm from SQL
Hot Standby
• HA Configuration!!!
• Shortest possible RTO/RPO
Demo : High Availability
SQL Server 2012 AlwaysOn Availability Groups
Demo Environment - Start
[Diagram: FARM1, SQL 1, SQL 2]
Demo Environment - End
[Diagram: FARM1, SQL 1, SQL 2, Failover Cluster, Clustered Resource]
Warm Standby
• Complex!
• Depends on Service Applications and associated Databases
Failover Processes
• Planned
• Unplanned
Demo : Warm Standby Farms
SQL Server 2012 AlwaysOn Availability Groups
Demo Environment - Start
[Diagram: Production (Auckland) and DR (Wellington); FARM 1, FARM 2, SQL 1, SQL 2, SQL 3; Failover Cluster; Clustered Resource]
Demo Environment - End
[Diagram: Production (Auckland) and DR (Wellington); FARM 1, FARM 2, SQL 1, SQL 2, SQL 3; Failover Cluster; Clustered Resource]
Demo Summary
Demo 1
• Create WSFC Cluster – Already done
• Created AlwaysOn Group – Added 3 SQL Nodes
• Create Listener
• Drop ContentDB and Repoint to listener
• Failover database to prove connection
Demo 2
• Discuss the listener and its use
• Built DR Farm with standby database – Sample site collection
• Repoint prod are Primary and Secondary HA SQL Servers
• Failover to Async DR Replica
• Drop DR Standby DB and Reconnect Prod replicated DB
• Show bringing the DR farm online
• Repoint DNS after testing
Cold Standby
SharePoint Farm Backup and Restore
• Native Tools
• Third Party Tools
SharePoint 2013 Backup and Recovery with DPM 2012
Limited to Content Databases and SharePoint Configuration
Summary
• HA versus DR
• RPO, RTO, SLAs, and Fault Domains
• Hot Standby
• Warm Standby
• Cold Standby
Related content
Breakout Sessions – Wednesday
Time | Code | Title | Speaker(s)
9:00am – 10:00am | SES201 | Overview of Enterprise Social from Microsoft | Paul Quirk, Adam Pisoni
10:40am – 11:40am | SES202 | Configuring your SharePoint 2013 Farm for Apps | Mark Rhodes
11:55am – 12:55pm | SES102 | The Social Intranet! | Debbie Ireland
1:55pm – 2:55pm | SES204 | That's so 2013! - An overview of Web Content Management In SharePoint 2013 | Jacques Botha
3:10pm – 4:10pm | SES314 | SharePoint Business Continuity Management | Neil Hodgkinson
4:30pm – 5:30pm | SES306 | SharePoint Forensic Deep Dive | Mark Rhodes
Related content
Breakout Sessions – Thursday
Time | Code | Title | Speaker(s)
9:00am – 10:00am | SES307 | Real world SharePoint 2013 architecture decisions | Wictor Wilén
10:40am – 11:40am | SES308 | SharePoint Search is Dead, Long Live SharePoint Search | Neil Hodgkinson
11:55am – 12:55pm | SES309 | JavaScript in SharePoint and not just for Apps | Wictor Wilén
1:55pm – 2:55pm | SES310 | Customising the Search Experience in SharePoint 2013 | Wayne Ewington
3:10pm – 4:10pm | SES311 | Upgrading, Deploying and Scaling out SharePoint Search : What No UI! | Neil Hodgkinson
4:30pm – 5:30pm | SES312 | Optimising SQL Server for SharePoint | Pat Martin, Wayne Ewington
5:45pm – 6:45pm | SES315 | SharePoint Stump the Experts : Panel Discussion | Panel Speakers
Related content
Breakout Sessions – Friday
Time | Code | Title | Speaker(s)
9:00am – 10:00am | SES313 | Mastering Office Web Apps Server 2013 operations | Wictor Wilén
10:40am – 11:40am | SES305 | Upgrading to SharePoint 2013 | Wayne Ewington
Related content
Related Certification Exams
Code Title
70-480 Programming in HTML5 with JavaScript and CSS3
70-486 Developing ASP.NET MVC 4 Web Applications
70-488 Developing Microsoft SharePoint 2013 Server Core Solutions
70-489 Developing Microsoft SharePoint 2013 Server Advanced Solutions
Code Title
70-410 Installing and Configuring Windows Server 2012
70-411 Administering Windows Server 2012
70-412 Configuring Advanced Windows Server 2012 Services
70-331 Core Solutions of Microsoft SharePoint 2013 Server
70-332 Advanced Solutions of Microsoft SharePoint 2013 Server
Related content
Find Me Later At...
• Stump the Chumps
• Search or other SharePoint related sessions
• Bar
• Floating around
© 2013 Microsoft Corporation. All rights reserved. Microsoft, Windows and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.