datacenter switchover - v1.2

Post on 10-Nov-2014

Workflow Steps

Perform a datacenter switchover for a database availability group

Version 1.2 (Updated 12/2012)

Exchange 2010 - Datacenter Switchover

Stop-DatabaseAvailabilityGroup

Restore-DatabaseAvailabilityGroup

Start-DatabaseAvailabilityGroup

Exchange 2010 - Datacenter Switchback

Stop-DatabaseAvailabilityGroup

Has the datacenter switchover been approved?

YES

NO

Stop-DatabaseAvailabilityGroup

Is the primary datacenter online or physically accessible?

YES

NO

Stop-DatabaseAvailabilityGroup

Do the remote and primary datacenters have network connectivity?

YES

NO

Stop-DatabaseAvailabilityGroup

Are the Exchange servers in the primary datacenter online?

YES

NO

Stop-DatabaseAvailabilityGroup

Is your DAG extended to multiple Active Directory sites?

YES

NO

Stop-DatabaseAvailabilityGroup

COMMANDS:

Using the Exchange Management Shell on a server in the recovery datacenter, run:

Stop-DatabaseAvailabilityGroup -Identity <DAGName> -ActiveDirectorySite <primary site>

Repeat the above command for all Active Directory sites containing DAG members that are not the recovery datacenter AD site.

EXPECTED OUTCOMES:

1) Verify the servers on the StartedMailboxServers and StoppedMailboxServers lists for the DAG:

Get-DatabaseAvailabilityGroup -Identity <DAGName> | FL

The StoppedMailboxServers list should contain all mailbox servers in the primary datacenter, and the StartedMailboxServers list should contain all mailbox servers in the recovery datacenter.

2) Exchange servers that were accessible in the primary datacenter should have their Cluster services forcibly cleaned up and the Cluster service should be configured with a startup type of DISABLED. You can verify this using Services.msc.

3) A double write to both a domain controller in the recovery datacenter and a domain controller in the primary datacenter of the StoppedMailboxServers attribute is performed. This is done to bypass Active Directory site replication latency.

COMMON ERRORS:

If a domain controller in the primary datacenter is not available, the command may return an Active Directory provider error. This error can be safely ignored.
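The check in expected outcome 1 can be narrowed to the two lists of interest. This is a minimal sketch, assuming a hypothetical DAG named DAG1; substitute your own DAG name.

```powershell
# Hypothetical DAG name; replace with your own.
$dagName = 'DAG1'

# Show only the properties that matter for this check.
Get-DatabaseAvailabilityGroup -Identity $dagName |
    Format-List Name, StartedMailboxServers, StoppedMailboxServers
```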

Command Completed?

Stop-DatabaseAvailabilityGroup

COMMANDS:

Using the Exchange Management Shell on a server in the recovery datacenter, run:

Stop-DatabaseAvailabilityGroup -Identity <DAGName> -MailboxServer <DAG member in primary site>

Repeat the above command for all DAG members that are not in the recovery datacenter.

EXPECTED OUTCOMES:

1) Verify the servers on the StartedMailboxServers and StoppedMailboxServers lists for the DAG:

Get-DatabaseAvailabilityGroup -Identity <DAGName> | FL

The StoppedMailboxServers list should contain all mailbox servers in the primary datacenter, and the StartedMailboxServers list should contain all mailbox servers in the recovery datacenter.

2) Exchange servers that were accessible in the primary datacenter should have their Cluster services forcibly cleaned up and the Cluster service should be configured with a startup type of DISABLED. You can verify this using Services.msc.

3) A double write to both a domain controller in the recovery datacenter and a domain controller in the primary datacenter of the StoppedMailboxServers attribute is performed. This is done to bypass Active Directory site replication latency.

COMMON ERRORS:

If a domain controller in the primary datacenter is not available, the command may return an Active Directory provider error. This error can be safely ignored.
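Repeating the command for every DAG member outside the recovery datacenter can be done with a simple loop. The sketch below assumes hypothetical names (DAG1, MBX1, MBX2); it is one way to script the repetition, not a required method.

```powershell
# Hypothetical names; replace with your DAG and its primary-datacenter members.
$dagName        = 'DAG1'
$primaryMembers = 'MBX1', 'MBX2'

foreach ($server in $primaryMembers) {
    # Run once per DAG member that is not in the recovery datacenter.
    Stop-DatabaseAvailabilityGroup -Identity $dagName -MailboxServer $server
}
```

Adding -WhatIf to the cmdlet call previews the action without making changes.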

Command Completed?

Stop-DatabaseAvailabilityGroup

Is your DAG extended to multiple Active Directory sites?

YES

NO

Stop-DatabaseAvailabilityGroup

COMMANDS:

Using the Exchange Management Shell on a server in the recovery datacenter, run:

Stop-DatabaseAvailabilityGroup -Identity <DAGName> -ActiveDirectorySite <primary site> -ConfigurationOnly:$True

Repeat for any additional Active Directory sites that are not the recovery datacenter.

EXPECTED OUTCOMES:

1) Verify the servers on the StartedMailboxServers and StoppedMailboxServers lists for the DAG:

Get-DatabaseAvailabilityGroup -Identity <DAGName> | FL

The StoppedMailboxServers list should contain all mailbox servers in the primary datacenter, and the StartedMailboxServers list should contain all mailbox servers in the recovery datacenter.

2) A double write to both a domain controller in the recovery datacenter and a domain controller in the primary datacenter of the StoppedMailboxServers attribute is performed. This is done to bypass Active Directory site replication latency.

COMMON ERRORS:

If a domain controller in the primary datacenter is not available, the command may return an Active Directory provider error. This error can be safely ignored.

Command Completed?

Stop-DatabaseAvailabilityGroup

COMMANDS:

Using the Exchange Management Shell on a server in the recovery datacenter, run:

Stop-DatabaseAvailabilityGroup -Identity <DAGName> -MailboxServer <DAG member in primary site> -ConfigurationOnly:$True

Repeat command for all DAG members that are not in the recovery datacenter.

EXPECTED OUTCOMES:

1) Verify the servers on the StartedMailboxServers and StoppedMailboxServers lists for the DAG:

Get-DatabaseAvailabilityGroup -Identity <DAGName> | FL

The StoppedMailboxServers list should contain all mailbox servers in the primary datacenter, and the StartedMailboxServers list should contain all mailbox servers in the recovery datacenter.

2) A double write to both a domain controller in the recovery datacenter and a domain controller in the primary datacenter of the StoppedMailboxServers attribute is performed. This is done to bypass Active Directory site replication latency.

COMMON ERRORS:

If a domain controller in the primary datacenter is not available, the command may return an Active Directory provider error. This error can be safely ignored.

Command Completed?

Stop-DatabaseAvailabilityGroup

Are the Exchange servers in primary datacenter online?

YES

NO

Stop-DatabaseAvailabilityGroup

Is your DAG extended to multiple Active Directory sites?

YES

NO

Stop-DatabaseAvailabilityGroup

COMMANDS:

Using the Exchange Management Shell on a server in the recovery datacenter, run:

Stop-DatabaseAvailabilityGroup -Identity <DAGName> -ActiveDirectorySite <primary site> -ConfigurationOnly:$True

Repeat for any additional Active Directory sites that are not the recovery datacenter Active Directory site.

EXPECTED OUTCOMES:

1) Verify the servers on the StartedMailboxServers and StoppedMailboxServers lists for the DAG:

Get-DatabaseAvailabilityGroup -Identity <DAGName> | FL

The StoppedMailboxServers list should contain all mailbox servers in the primary datacenter, and the StartedMailboxServers list should contain all mailbox servers in the recovery datacenter.

2) A double write to both a domain controller in the recovery datacenter and a domain controller in the primary datacenter of the StoppedMailboxServers attribute is performed. This is done to bypass Active Directory site replication latency.

COMMON ERRORS:

If a domain controller in the primary datacenter is not available, the command may return an Active Directory provider error. This error can be safely ignored.

Command Completed?

Stop-DatabaseAvailabilityGroup

COMMANDS:

Using the Exchange Management Shell on a server in the recovery datacenter, run:

Stop-DatabaseAvailabilityGroup -Identity <DAGName> -MailboxServer <DAG member in primary site> -ConfigurationOnly:$True

Repeat for any additional DAG members that are not in the recovery datacenter Active Directory site.

EXPECTED OUTCOMES:

1) Verify the servers on the StartedMailboxServers and StoppedMailboxServers lists for the DAG:

Get-DatabaseAvailabilityGroup -Identity <DAGName> | FL

The StoppedMailboxServers list should contain all mailbox servers in the primary datacenter, and the StartedMailboxServers list should contain all mailbox servers in the recovery datacenter.

2) A double write to both a domain controller in the recovery datacenter and a domain controller in the primary datacenter of the StoppedMailboxServers attribute is performed. This is done to bypass Active Directory site replication latency.

COMMON ERRORS:

If a domain controller in the primary datacenter is not available, the command may return an Active Directory provider error. This error can be safely ignored.

Command Completed?

Stop-DatabaseAvailabilityGroup

COMMANDS:

Optional: If Exchange Management Shell access to the primary datacenter is available, run:

Stop-DatabaseAvailabilityGroup -Identity <DAGName> -ActiveDirectorySite <primary site>

Repeat for any additional Active Directory sites that are not the recovery datacenter Active Directory site.

EXPECTED OUTCOMES:

1) Verify the servers on the StartedMailboxServers and StoppedMailboxServers lists for the DAG:

Get-DatabaseAvailabilityGroup -Identity <DAGName> | FL

The StoppedMailboxServers list should contain all mailbox servers in the primary datacenter, and the StartedMailboxServers list should contain all mailbox servers in the recovery datacenter.

2) Exchange servers that were accessible in the primary datacenter should have their Cluster services forcibly cleaned up and the Cluster service should be configured with a startup type of DISABLED. You can verify this using Services.msc.

3) A double write to both a domain controller in the recovery datacenter and a domain controller in the primary datacenter of the StoppedMailboxServers attribute is performed. This is done to bypass Active Directory site replication latency.

COMMON ERRORS:

If a domain controller in the primary datacenter is not available, the command may return an Active Directory provider error. This error can be safely ignored.

If no Exchange server instance is functional to service the Exchange Management Shell, this step can be skipped.

Command Completed?

Stop-DatabaseAvailabilityGroup

COMMANDS:

Using the Exchange Management Shell on a server in the recovery datacenter, run:

Stop-DatabaseAvailabilityGroup -Identity <DAGName> -MailboxServer <DAG member in primary site>

Repeat for any additional DAG members that are not in the recovery datacenter Active Directory site.

EXPECTED OUTCOMES:

1) Verify the servers on the StartedMailboxServers and StoppedMailboxServers lists for the DAG:

Get-DatabaseAvailabilityGroup -Identity <DAGName> | FL

The StoppedMailboxServers list should contain all mailbox servers in the primary datacenter, and the StartedMailboxServers list should contain all mailbox servers in the recovery datacenter.

2) A double write to both a domain controller in the recovery datacenter and a domain controller in the primary datacenter of the StoppedMailboxServers attribute is performed. This is done to bypass Active Directory site replication latency.

COMMON ERRORS:

If a domain controller in the primary datacenter is not available, the command may return an Active Directory provider error. This error can be safely ignored.

Command Completed?

Stop-DatabaseAvailabilityGroup

Is your DAG extended to multiple Active Directory sites?

YES

NO

Stop-DatabaseAvailabilityGroup

COMMANDS:

Using the Exchange Management Shell on a server in the recovery datacenter, run:

Stop-DatabaseAvailabilityGroup -Identity <DAGName> -ActiveDirectorySite <primary site> -ConfigurationOnly:$True

Repeat for any additional Active Directory sites that are not the recovery datacenter Active Directory site.

EXPECTED OUTCOMES:

1) Verify the servers on the StartedMailboxServers and StoppedMailboxServers lists for the DAG:

Get-DatabaseAvailabilityGroup -Identity <DAGName> | FL

The StoppedMailboxServers list should contain all mailbox servers in the primary datacenter, and the StartedMailboxServers list should contain all mailbox servers in the recovery datacenter.

2) A double write to both a domain controller in the recovery datacenter and a domain controller in the primary datacenter of the StoppedMailboxServers attribute is performed. This is done to bypass Active Directory site replication latency.

COMMON ERRORS:

If a domain controller in the primary datacenter is not available, the command may return an Active Directory provider error. This error can be safely ignored.

Command Completed?

Stop-DatabaseAvailabilityGroup

COMMANDS:

Using the Exchange Management Shell on a server in the recovery datacenter, run:

Stop-DatabaseAvailabilityGroup -Identity <DAGName> -ActiveDirectorySite <primary site> -ConfigurationOnly:$True

Repeat for any additional Active Directory sites that are not the recovery datacenter Active Directory site.

EXPECTED OUTCOMES:

1) Verify the servers on the StartedMailboxServers and StoppedMailboxServers lists for the DAG:

Get-DatabaseAvailabilityGroup -Identity <DAGName> | FL

The StoppedMailboxServers list should contain all mailbox servers in the primary datacenter, and the StartedMailboxServers list should contain all mailbox servers in the recovery datacenter.

2) A double write to both a domain controller in the recovery datacenter and a domain controller in the primary datacenter of the StoppedMailboxServers attribute is performed. This is done to bypass Active Directory site replication latency.

COMMON ERRORS:

If a domain controller in the primary datacenter is not available, the command may return an Active Directory provider error. This error can be safely ignored.

Command Completed?

Stop-DatabaseAvailabilityGroup

COMMANDS:

OPTIONAL: Using the Exchange Management Shell on a server in the recovery datacenter, run:

Stop-DatabaseAvailabilityGroup -Identity <DAGName> -ActiveDirectorySite <primary site> -ConfigurationOnly:$True

Repeat for any additional Active Directory sites that are not the recovery datacenter Active Directory site.

EXPECTED OUTCOMES:

1) Verify the servers on the StartedMailboxServers and StoppedMailboxServers lists for the DAG (this assumes at least one Exchange server exists):

Get-DatabaseAvailabilityGroup -Identity <DAGName> | FL

The StoppedMailboxServers list should contain all mailbox servers in the primary datacenter, and the StartedMailboxServers list should contain all mailbox servers in the recovery datacenter.

2) A double write to both a domain controller in the recovery datacenter and a domain controller in the primary datacenter of the StoppedMailboxServers attribute is performed. This is done to bypass Active Directory site replication latency.

COMMON ERRORS:

If a domain controller in the primary datacenter is not available, the command may return an Active Directory provider error. This error can be safely ignored.

Command Completed?

Stop-DatabaseAvailabilityGroup

COMMANDS:

Optional: If Exchange Management Shell access to the primary datacenter is available, run:

Stop-DatabaseAvailabilityGroup -Identity <DAGName> -MailboxServer <DAG member in primary site> -ConfigurationOnly:$True

Repeat for any additional DAG members that are not in the recovery datacenter Active Directory site.

EXPECTED OUTCOMES:

1) Verify the servers on the StartedMailboxServers and StoppedMailboxServers lists for the DAG (this assumes at least one Exchange server exists):

Get-DatabaseAvailabilityGroup -Identity <DAGName> | FL

The StoppedMailboxServers list should contain all mailbox servers in the primary datacenter, and the StartedMailboxServers list should contain all mailbox servers in the recovery datacenter.

2) A double write to both a domain controller in the recovery datacenter and a domain controller in the primary datacenter of the StoppedMailboxServers attribute is performed. This is done to bypass Active Directory site replication latency.

COMMON ERRORS:

If a domain controller in the primary datacenter is not available, the command may return an Active Directory provider error. This error can be safely ignored.

Command Completed?

Stop-DatabaseAvailabilityGroup

Is your DAG extended to multiple Active Directory sites?

YES

NO

Stop-DatabaseAvailabilityGroup

COMMANDS:

Optional: If Exchange Management Shell access to the primary datacenter is available, run:

Stop-DatabaseAvailabilityGroup -Identity <DAGName> -ActiveDirectorySite <primary site> -ConfigurationOnly:$True

Repeat for any additional Active Directory sites that are not the recovery datacenter Active Directory site.

EXPECTED OUTCOMES:

1) Verify the servers on the StartedMailboxServers and StoppedMailboxServers lists for the DAG:

Get-DatabaseAvailabilityGroup -Identity <DAGName> | FL

The StoppedMailboxServers list should contain all mailbox servers in the primary datacenter, and the StartedMailboxServers list should contain all mailbox servers in the recovery datacenter.

2) A double write to both a domain controller in the recovery datacenter and a domain controller in the primary datacenter of the StoppedMailboxServers attribute is performed. This is done to bypass Active Directory site replication latency.

COMMON ERRORS:

If a domain controller in the primary datacenter is not available, the command may return an Active Directory provider error. This error can be safely ignored.

Command Completed?

Stop-DatabaseAvailabilityGroup

COMMANDS:

Using the Exchange Management Shell on a server in the recovery datacenter, run:

Stop-DatabaseAvailabilityGroup -Identity <DAGName> -MailboxServer <DAG member in primary site> -ConfigurationOnly:$True

Repeat command for all DAG members that are not in the recovery datacenter.

EXPECTED OUTCOMES:

1) Verify the servers on the StartedMailboxServers and StoppedMailboxServers lists for the DAG:

Get-DatabaseAvailabilityGroup -Identity <DAGName> | FL

The StoppedMailboxServers list should contain all mailbox servers in the primary datacenter, and the StartedMailboxServers list should contain all mailbox servers in the recovery datacenter.

Command Completed?

Restore-DatabaseAvailabilityGroup

Did Stop-DatabaseAvailabilityGroup complete successfully?

YES

NO

Restore-DatabaseAvailabilityGroup

COMMANDS:

Stop the Cluster service on each DAG member in the recovery datacenter. To do this, run the appropriate command for your DAG member’s operating system:

• Windows Server 2008 R2: Stop-Service Clussvc

• Windows Server 2008 SP2: Net Stop Clussvc

EXPECTED OUTCOMES:

Cluster services are stopped on remaining nodes.

COMMON ERRORS:

Access denied: if you are not using the default administrator account, run the command from an elevated command prompt (Run as administrator).
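Before moving on, the Cluster service state can be checked on all recovery datacenter members from one shell. This is a sketch that assumes hypothetical server names (MBX3, MBX4) and that remote service queries are permitted between the servers.

```powershell
# Hypothetical recovery-datacenter DAG members; replace with your own.
$recoveryMembers = 'MBX3', 'MBX4'

# Report the Cluster service state on each member; Status should read Stopped.
Get-Service -Name ClusSvc -ComputerName $recoveryMembers |
    Select-Object MachineName, Name, Status
```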

Command Completed?

Restore-DatabaseAvailabilityGroup

Is the Cluster service stopped on all DAG members in your recovery datacenter?

YES

NO

Restore-DatabaseAvailabilityGroup

COMMANDS:

From the Exchange Management Shell on an Exchange server in the recovery datacenter, run:

Restore-DatabaseAvailabilityGroup -Identity <DAGName> -ActiveDirectorySite <recovery site> -AlternateWitnessDirectory:<AWSPath> -AlternateWitnessServer:<AWSName>

EXPECTED OUTCOMES:

1) A DAG member in the recovery datacenter is randomly selected and its Cluster service is started in /forcequorum mode

2) DAG members on the StoppedMailboxServers list are evicted from the DAG’s cluster, thereby adjusting the membership count

a) If the resulting membership count is EVEN or results in a SINGLE node, the Cluster is configured with a Node and File Share Majority quorum and it begins using the Alternate Witness Server and Alternate Witness Directory

3) Cluster services are started on the remaining DAG members and they successfully join the DAG’s cluster

VERIFICATION:

Verify that the DAG members are up and the Cluster Group is online by running the following commands:

Windows Server 2008 R2:

1) Import-Module FailoverClusters

2) Get-ClusterNode -Cluster <DAGName>

3) Get-ClusterGroup -Cluster <DAGName>

Windows Server 2008 SP2:

1) Cluster <DAGName> node

2) Cluster <DAGName> group

COMMON ERRORS:

Nodes fail to evict with error 0x46. See http://aka.ms/0x46
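For illustration, the restore command and the Windows Server 2008 R2 verification steps might look like this with the placeholders filled in. Every name and path below (DAG1, RecoverySite, HUB2, C:\DAG1FSW) is a hypothetical value, not one taken from this workflow.

```powershell
# All names and paths below are hypothetical placeholders.
$dagName       = 'DAG1'
$recoverySite  = 'RecoverySite'
$witnessServer = 'HUB2'
$witnessDir    = 'C:\DAG1FSW'

Restore-DatabaseAvailabilityGroup -Identity $dagName `
    -ActiveDirectorySite $recoverySite `
    -AlternateWitnessServer $witnessServer `
    -AlternateWitnessDirectory $witnessDir

# Windows Server 2008 R2: confirm node state and that the cluster groups are online.
Import-Module FailoverClusters
Get-ClusterNode  -Cluster $dagName | Format-Table Name, State -AutoSize
Get-ClusterGroup -Cluster $dagName | Format-Table Name, OwnerNode, State -AutoSize
```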

Command Completed?

Restore-DatabaseAvailabilityGroup

Assuming all prerequisites have been met, any activation blocks can now be removed and databases can be mounted.
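How an activation block is removed depends on how it was set. The sketch below assumes the block was applied at the server level through the DatabaseCopyAutoActivationPolicy setting, and uses a hypothetical server name (MBX3). Blocks set on individual copies with Suspend-MailboxDatabaseCopy -ActivationOnly are lifted with Resume-MailboxDatabaseCopy instead.

```powershell
# Hypothetical recovery-datacenter server; replace with your own.
$server = 'MBX3'

# Allow automatic activation again if it was blocked at the server level.
Set-MailboxServer -Identity $server -DatabaseCopyAutoActivationPolicy Unrestricted

# Review copy status; a database that should be active but shows Dismounted
# can then be mounted with Mount-Database.
Get-MailboxDatabaseCopyStatus -Server $server |
    Format-Table Name, Status, CopyQueueLength, ReplayQueueLength -AutoSize
```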

Command Completed?

Start-DatabaseAvailabilityGroup

Is your primary datacenter online?

YES

NO

Start-DatabaseAvailabilityGroup

Ensure that supporting services are available including but not limited to:

1) Active Directory / domain controllers / global catalog / FSMO role holders

2) Domain Name Services (DNS)

3) Witness Server

4) Supporting Exchange roles: Client Access and Hub Transport

OPTIONAL:

• Dynamic Host Configuration Protocol (DHCP) servers, if DHCP addresses are used for DAG networks

• Edge Transport server

• Unified Messaging server

Continue…

Start-DatabaseAvailabilityGroup

Are the necessary services established and functioning?

YES

NO

Start-DatabaseAvailabilityGroup

COMMANDS:

Verify network connectivity between all DAG members.

Suggested methods:

1) Ping test between DAG members

2) Map administrative shares between DAG members

EXPECTED OUTCOMES:

Connectivity between datacenters is functioning and all cluster inter-node communications are operating normally
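The ping test can be scripted so that every DAG member is checked in one pass. This sketch assumes hypothetical server names and tests only from the server where it is run; repeating it from a member in each datacenter gives a fuller picture.

```powershell
# Hypothetical DAG member names; replace with your own.
$dagMembers = 'MBX1', 'MBX2', 'MBX3', 'MBX4'

foreach ($server in $dagMembers) {
    # -Quiet returns $true if any echo request succeeds, $false if all fail.
    $reachable = Test-Connection -ComputerName $server -Count 2 -Quiet
    "{0}: reachable = {1}" -f $server, $reachable
}
```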

Command Completed?

Start-DatabaseAvailabilityGroup

Have datacenter communications been verified?

YES

NO

Start-DatabaseAvailabilityGroup

Verify that the Cluster service on each DAG member in the primary datacenter has a startup type of DISABLED. If it does not, either the Stop-DatabaseAvailabilityGroup command was not successful or the DAG members in the primary datacenter failed to receive the eviction notification after network connectivity between datacenters was restored.

Do not proceed until Cluster service cleanup has occurred and Cluster service has a startup type of DISABLED.

You can optionally run the following command on the DAG members in the primary datacenter to forcibly clean up the outdated cluster information:

Cluster node /forcecleanup
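The startup type can also be read remotely instead of opening Services.msc on each server. This is a sketch assuming hypothetical server names (MBX1, MBX2) and that remote WMI access to the primary datacenter members is allowed.

```powershell
# Hypothetical primary-datacenter DAG members; replace with your own.
$primaryMembers = 'MBX1', 'MBX2'

foreach ($server in $primaryMembers) {
    # StartMode should read Disabled before Start-DatabaseAvailabilityGroup is run.
    Get-WmiObject -Class Win32_Service -Filter "Name='ClusSvc'" -ComputerName $server |
        Select-Object SystemName, Name, State, StartMode
}
```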

Continue…

Start-DatabaseAvailabilityGroup

Does the Cluster service show a startup type of disabled?

YES

NO

Start-DatabaseAvailabilityGroup

Is your DAG extended to multiple Active Directory sites?

YES

NO

Start-DatabaseAvailabilityGroup

COMMAND:

Using the Exchange Management Shell, run the following command:

Start-DatabaseAvailabilityGroup -Identity <DAGName> -ActiveDirectorySite <primary site>

Repeat for all other Active Directory sites that were stopped during the datacenter switchover process.

EXPECTED OUTCOMES:

1) DAG members in the primary datacenter are added to the DAG’s cluster

2) If the resulting membership count is EVEN, the cluster is configured to use the Node and File Share Majority quorum

VERIFICATION:

Verify that the DAG members are up and the Cluster Group is online by running the following commands:

Windows Server 2008 R2:

1) Import-Module FailoverClusters

2) Get-ClusterNode -Cluster <DAGName>

3) Get-ClusterGroup -Cluster <DAGName>

Windows Server 2008 SP2:

1) Cluster <DAGName> node

2) Cluster <DAGName> group

The following command should show a StartedMailboxServers list containing all DAG members and an empty StoppedMailboxServers list:

Get-DatabaseAvailabilityGroup -Identity <DAGName> | FL

COMMON ERRORS:

Nodes may fail to join the cluster with an invalid node error. If this occurs, retry the command.
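The empty StoppedMailboxServers list can be checked programmatically rather than by reading the full output. This is a sketch assuming a hypothetical DAG named DAG1.

```powershell
# Hypothetical DAG name; replace with your own.
$dagName = 'DAG1'

$dag = Get-DatabaseAvailabilityGroup -Identity $dagName

# @() guarantees an array, so Count works even when the list is empty.
if (@($dag.StoppedMailboxServers).Count -eq 0) {
    "All DAG members are started: $($dag.StartedMailboxServers -join ', ')"
} else {
    "Still stopped: $($dag.StoppedMailboxServers -join ', ')"
}
```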

Continue…

Start-DatabaseAvailabilityGroup

COMMAND:

Using the Exchange Management Shell, run the following command:

Start-DatabaseAvailabilityGroup -Identity <DAGName> -MailboxServer <DAG member in primary site>

Repeat for all other Mailbox servers that were stopped during the datacenter switchover process.

EXPECTED OUTCOMES:

1) DAG members in the primary datacenter are added to the DAG’s cluster

2) If the resulting membership count is EVEN, the cluster is configured to use the Node and File Share Majority quorum

VERIFICATION:

Verify that the DAG members are up and the Cluster Group is online by running the following commands:

Windows Server 2008 R2:

1) Import-Module FailoverClusters

2) Get-ClusterNode -Cluster <DAGName>

3) Get-ClusterGroup -Cluster <DAGName>

Windows Server 2008 SP2:

1) Cluster <DAGName> node

2) Cluster <DAGName> group

The following command should show a StartedMailboxServers list containing all DAG members and an empty StoppedMailboxServers list:

Get-DatabaseAvailabilityGroup -Identity <DAGName> | FL

COMMON ERRORS:

Nodes may fail to join the cluster with an invalid node error. If this occurs, retry the command.

Continue…

Start-DatabaseAvailabilityGroup

Were the DAG members added to the cluster successfully?

YES

NO

Start-DatabaseAvailabilityGroup

COMMANDS:

Reset the DAG’s Witness Server and Alternate Witness Server properties by running the following command:

Set-DatabaseAvailabilityGroup -Identity <DAGName> -WitnessServer <WSName> -AlternateWitnessServer <AWSName>

EXPECTED OUTCOMES:

Witness Server and Alternate Witness Server properties are configured to ensure the appropriate witness server is in use

If the Cluster configuration does not match the DAG configuration, the Cluster is updated with the proper configuration

COMMON ERRORS:

Administrators incorrectly verify which file share witness is currently in use. See http://aka.ms/E14FSW.
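One way to confirm which witness is in use is to query the DAG with the -Status switch. This sketch assumes a hypothetical DAG named DAG1 and Exchange 2010 SP1 or later, where the WitnessShareInUse property is expected to be reported; on earlier builds the property may be absent.

```powershell
# Hypothetical DAG name; replace with your own.
$dagName = 'DAG1'

# -Status adds live cluster information to the Active Directory configuration.
Get-DatabaseAvailabilityGroup -Identity $dagName -Status |
    Format-List Name, WitnessServer, AlternateWitnessServer, WitnessShareInUse
```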

Continue…

Start-DatabaseAvailabilityGroup

After any activation blocks have been removed, active database copies can be moved to servers in the primary datacenter
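Moving an active copy back can be sketched as follows, assuming a hypothetical database DB1 and a hypothetical primary datacenter server MBX1. Checking copy health first is a precaution added here, not a step stated in this workflow.

```powershell
# Hypothetical database and server names; replace with your own.
$database      = 'DB1'
$primaryServer = 'MBX1'

# Confirm the target copy is healthy and its queues are low before activating it.
Get-MailboxDatabaseCopyStatus -Identity "$database\$primaryServer" |
    Format-List Name, Status, CopyQueueLength, ReplayQueueLength

# Activate the copy hosted in the primary datacenter.
Move-ActiveMailboxDatabase -Identity $database -ActivateOnServer $primaryServer
```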

Continue…
