Exchange Server 2010 & 2013: Disaster Recovery – Troubleshooter v.1.0

Upload: branden-terry

Post on 23-Dec-2015



Instructions

How to use this tool?

1. Switch to “Slide Show”
2. Select the kind of issue occurring
3. Follow the instructions for each scenario

Scope of issue:
• Mailbox
• Database
• Exchange Server

Mailbox level issues

Symptoms and common causes:
• Item count issues, or a mismatch between item sizes and mailbox size;
• Items “disappearing” (e.g., meetings, contacts, e-mails);
• Items being duplicated;
• OOF (Out-Of-Office) showing unusual behavior and/or errors;
• Outlook AND OWA showing errors during mailbox access or folder navigation;
• Corruption of “old” items (e.g., previously read e-mails that cannot be opened);
• Pre-defined searches no longer working;
• Display name corruption for items/folders.

Symptoms match? YES / NO

Mailbox level issues

Troubleshooting:
• Using Exchange Management Shell (EMS):
    New-MailboxRepairRequest -Mailbox <Affected_Mailbox> -CorruptionType <SearchFolder,AggregateCounts,FolderView,ProvisionedFolder>

Expected results:
On the Mailbox Server holding the mailbox, the “Application” log in Event Viewer should show:
• EventID: 10047 – Source: MSExchangeIS Mailbox Store – The repair request started.
• EventID: 10048 – Source: MSExchangeIS Mailbox Store – The repair request completed successfully.
• EventID: 10062 – Source: MSExchangeIS Mailbox Store – Corruption was detected; the event details report the corruption type and whether it was repaired.
• EventID: 10049 – Source: MSExchangeIS Mailbox Store – The repair request failed (a problem with the database, or another task running against it).

Symptoms persist? YES / NO

Mailbox level issues

Troubleshooting:
• Get mailbox statistics, through EMS:
    Get-MailboxFolderStatistics <Affected_Mailbox> | ft Identity,ItemsInFolder,FolderSize -AutoSize
• Through EMS, try moving the mailbox to a different database. The move skips logically corrupt items, up to the bad item limit:
    New-MoveRequest -Identity <Affected_Mailbox> -BadItemLimit <0 to 50> -TargetDatabase <different_database>

Expected results:
• Check the move request report (in Exchange Management Console (EMC) > Recipient Configuration > Move Request; or, via PowerShell, Get-MoveRequest and Get-MoveRequestStatistics -IncludeReport).
• Warning: If logical corruption happened, the affected data may be lost.
• Note: The “MoveHistory” can still be recovered, even after the move request was removed:
    $MoveReport = (Get-MailboxStatistics -Identity <Affected_Mailbox> -IncludeMoveReport).MoveHistory
    $MoveReport > path\history_file_name.txt

Symptoms persist? YES / NO

Database level issues

Symptoms and common causes:
• The same type of issues already listed for mailbox level, though affecting several (or all) mailboxes within a database (also called DB);
• Database won’t mount after an Information Store crash (possible logical corruption);
• Database dismounted, and won’t mount;
• Database states “dirty shutdown” during *.edb check via ESEUTIL /MH;
• Database states “dirty shutdown” during *.log check via ESEUTIL /ML;
• Database states “dirty shutdown” and logs are “disappearing” (check Antivirus);
• Database states “clean shutdown” AND “log required” via ESEUTIL /MH (and vice-versa).

Symptoms match? YES / NO

Database level issues

Is the affected database protected by a DAG (Database Availability Group)? YES / NO

Database level issues

Database protected by DAG:
• The database copy may already be mounted on another DAG member, as long as that copy was in the “Healthy” state.
• However, the DAG could suffer a failure that prevents databases from mounting. Administrators then have to rebuild the DAG copy through a restore or, in the worst cases, force the copy on a healthy server to mount, accepting some data loss.
• There are, literally, dozens of factors that can cause this kind of scenario. Our approach is therefore to discuss the most common scenarios, and how to fix each one.

Possible action plans:
• Rebuild copy
• Failback
• Force mounting
• Rebuild DB Index
• Dial-Tone

Database level issues

Dial-Tone Database:
• Check database and log paths, through EMS:
    Get-MailboxDatabase <Affected_DB> | fl *path*
• Check “EdbFilePath” and “LogFolderPath” and make sure no files remain in those locations (move the files to a safe location rather than deleting them).
• Force the database to mount (via EMC or EMS): Mount-Database <Affected_DB>. Accept the creation of new log and EDB files.
• When the original DB is recovered, swap the EDBs: dismount the current (dial-tone) DB and move it to a safe location, then put the recovered EDB in its place (or simply overwrite it using the backup tool, once the dial-tone DB has been copied to a safe location).
• Merging data of the dial-tone and production EDBs:
    • Create a recovery database on the same server (any meaningful name can be used; -Recovery configures the new DB in “recovery mode”):
        New-MailboxDatabase -Recovery -Name <Recovery_DB> -Server <Recover_Server> -EdbFilePath <“path+name.edb”> -LogFolderPath <“path_logs”>
    • Mount the DB configured in the prior step:
        Mount-Database <Recovery_DB>
    • Configure the production DB to allow restore: check “This database can be overwritten by a restore” on the “Maintenance” tab of the production database in EMC, or use EMS (Set-MailboxDatabase <Production_DB> -AllowFileRestore $true).
    • Via EMS, execute (Exchange 2010; on Exchange 2013, Restore-Mailbox is not available and New-MailboxRestoreRequest is used instead):
        Get-MailboxStatistics -Database Recovery_DB | Restore-Mailbox -RecoveryDatabase Recovery_DB
    • After this cmdlet, check that Outlook is not showing “Maintenance” warnings and that it already presents all the data (recovered from backup, plus the dial-tone data). OWA shows no warning message, so it is best to test through Outlook to check whether the operation succeeded.
• Keeping the dial-tone DB mounted as the production database, and merging data from the recovered database (e.g., restored from backup) into it, will cause permanent Outlook pop-ups about “Maintenance mode”. If this approach is adopted, the only way to fix it is to recreate the Outlook profile on each machine displaying the message.

Symptoms persist? YES / NO

Database level issues

Standalone Database (no DAG):
• This type of database cannot fail over.
• There are at least three ways to recover a standalone database that fails, ranging from log sequence verification to a restore from backup.
• We are going to discuss the most common procedures to recover standalone databases.

Possible action plans:
• Check logs and EDB
• ESEutil /P
• Replay Logs

Database level issues

Check logs and EDB:
• Check Windows Event Viewer, in the “Application” section, for events with source “ESE”.
• Check disk space on the paths used for logs & EDB.
• If the checks above show nothing abnormal, it is time to check the EDB.
• Open an elevated CMD, then go to the EDB file path and execute ESEUTIL /MH against the file:
    • Note down the values of the fields “State” and “Log Required”;
    • “State” can display “Clean Shutdown” or “Dirty Shutdown”;
    • “Log Required” can display any value from “0-0” (no log required) to a range of required logs.
• Any database whose “State” is “Clean Shutdown” is technically ready to be mounted, even if all logs are lost. However, some serious kinds of physical corruption can leave a DB in “Clean Shutdown” state that still fails to mount, with several errors.
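As a rough illustration of reading the two fields noted above, the sketch below pulls “State” and “Log Required” out of captured ESEUTIL /MH text. The sample text and the helper names (parse_header, needs_log_replay) are hypothetical, and the exact field layout is an assumption to be checked against real output.

```python
import re

# Illustrative excerpt in the general shape of ESEUTIL /MH output.
# The exact layout is an assumption; adjust the patterns to real output.
SAMPLE_HEADER = """
            State: Dirty Shutdown
     Log Required: 1-2 (0x1-0x2)
"""


def parse_header(text):
    """Return (state, first_required, last_required) from header text."""
    state = re.search(r"^\s*State:\s*(.+?)\s*$", text, re.MULTILINE)
    logs = re.search(r"^\s*Log Required:\s*(\d+)-(\d+)", text, re.MULTILINE)
    if state is None or logs is None:
        raise ValueError("State or Log Required field not found")
    return state.group(1), int(logs.group(1)), int(logs.group(2))


def needs_log_replay(text):
    """True when the header shows Dirty Shutdown or still lists required logs."""
    state, low, high = parse_header(text)
    return state != "Clean Shutdown" or (low, high) != (0, 0)
```

A header with “Clean Shutdown” and “0-0” makes needs_log_replay return False; anything else points to the log checks that follow.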

Database level issues

Check logs & EDB:
• Open an elevated CMD, go to the log folder path, and execute ESEutil /ML against the log generation prefix:
    • Example: e:\Db1\Logs\> ESEutil /ML E00 (“E00” is the standard prefix for new DBs, although this value can change).
• The log sequence and the state of each log are displayed. The state can be “OK”, “Missing”, or “Error:” (example):
    • E0000000001.log – OK
    • E0000000002.log – OK
    • E0000000003.log – OK
    • E0000000004.log – OK
    • E0000000005.log – OK
    • E0000000006.log – OK
    • E0000000007.log – OK
    • E0000000008.log – OK
    • E0000000009.log – OK
    • E000000000A.log – OK
    • E000000000B.log – OK
    • (...)

Symptoms match? YES / NO
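The sequence check can also be reasoned about from the file names alone, since each log name is the prefix followed by the generation number in hexadecimal. The sketch below is only an illustration of that idea, not a replacement for ESEutil /ML (it cannot detect damaged logs); the function name missing_generations is hypothetical, and the eight-hex-digit naming is taken from the example list above.

```python
import re


def missing_generations(prefix, filenames):
    """Return generation numbers absent between the lowest and highest log found."""
    pattern = re.compile(
        re.escape(prefix) + r"([0-9A-F]{8})\.log$", re.IGNORECASE
    )
    found = set()
    for name in filenames:
        match = pattern.match(name)
        if match:
            # The numeric part of the file name is hexadecimal.
            found.add(int(match.group(1), 16))
    if not found:
        return []
    return [g for g in range(min(found), max(found) + 1) if g not in found]
```

Files that do not follow the numbered pattern, such as the current log E00.log, are ignored by this sketch.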

Database level issues

Check logs & EDB:
• If “State” shows “Dirty Shutdown” and “Log Required” shows any value other than “0-0” (as expected for a dirty shutdown), the required logs must be located. Example:
    • DB1 State “Dirty Shutdown” – Log Required “0x1 – 0x2”
• To identify the generation of a given log file, open an elevated CMD and execute: ESEutil /ML e04.log (example). The field “lGeneration” gives the sequence number of that particular log, which can be matched against the “Log Required” field shown by ESEUTIL /MH for the database.
• If every .log file listed in “Log Required” is present and healthy, we can proceed to the “Replay” process.

Symptoms persist? YES / NO
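To make the mapping between “Log Required” and file names concrete, here is a minimal sketch that expands a required range into the expected log file names. The function names are hypothetical, and the naming rule (prefix plus eight hexadecimal digits) is assumed from the example list shown earlier. Note that the most recent generation may still be the open log named only by the prefix (e.g., E04.log), which this sketch does not account for.

```python
def parse_log_required(field):
    """Turn a hexadecimal range such as '0x1-0x2' into a pair of integers."""
    low, high = (part.strip() for part in field.split("-"))
    return int(low, 16), int(high, 16)


def required_log_files(prefix, first_gen, last_gen):
    """List the log file names covering generations first_gen..last_gen inclusive."""
    if first_gen > last_gen:
        raise ValueError("first_gen must not exceed last_gen")
    return [f"{prefix}{gen:08X}.log" for gen in range(first_gen, last_gen + 1)]
```

For the DB1 example, the range “0x1-0x2” with prefix E04 expands to E0400000001.log and E0400000002.log.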

Database level issues

Replay Logs process:
• If the log sequence and the EDB were validated successfully, it is time to replay the logs:
    • Through an elevated CMD, go to the log path.
    • Execute “ESEutil /R E04” (as discussed before, this value can be different. Check the prefix used in every log file name for a “tip”, or use ESEUTIL /MH to find out).
• This command identifies the path to the EDB and applies the logs required by the DB, after checking log integrity and sequence again.
• At the end, if no errors were detected, the EDB will display “State” = “Clean Shutdown” when ESEUTIL /MH is executed.
• After this, we are ready to mount the database, with no special parameters needed.

Symptoms persist? YES / NO

Database level issues

ESEutil /P:
• ALWAYS the last resort (recommended only after attempts to fix the issue with Microsoft Support representatives have failed).
• Implies loss of data.
• Open an elevated CMD and go to the EDB path.
• Always make a safe copy of the EDB before running /P. Execute: ESEutil /P db1.edb (example).
• After this process, the EDB will be in “Clean Shutdown” state. Yet, it is not logically consistent.
• As the “ISInteg” tool is now deprecated, EMS cmdlets are used to fix this, run against the database once it has been mounted again:
    New-MailboxRepairRequest -Database <Affected_DB> -CorruptionType <SearchFolder,AggregateCounts,FolderView,ProvisionedFolder>

Symptoms persist? YES / NO

Database level issues

Recreate copy:
• On the server where the DB first crashed (and which now holds the passive copy):
    Suspend-MailboxDatabaseCopy -Identity <DB_Name\Server_Name>
• Executing a “full reseed” of that copy, seeding from a server with a healthy copy:
    Update-MailboxDatabaseCopy -Identity <DB_Name\Server_Name> -SourceServer <Healthy_Copy_Server_Name> -DeleteExistingFiles
• This process can take a long time, depending on database size.

Symptoms persist? YES / NO

Database level issues

Recreating Content Index for a DAG database:
• On the server presenting the Content Index issue:
    Suspend-MailboxDatabaseCopy -Identity <DB_Name\Server_Name>
• Regenerating the Content Index:
    Update-MailboxDatabaseCopy -Identity <DB_Name\Server_Name> -CatalogOnly
• This process can take a long time, depending on database size.

Symptoms persist? YES / NO

Database level issues

Forced mounting:
• Data loss is possible, though uncommon.
• On the affected server, where forced mounting will be attempted:
    Move-ActiveMailboxDatabase -Identity <DB_Name> -ActivateOnServer <Server_Name> -MountDialOverride "BestEffort" -SkipActiveCopyChecks -SkipLagChecks -SkipClientExperienceChecks -SkipHealthChecks
• This cmdlet bypasses basically “all” routines used to check DAG database integrity and health, and attempts to mount the DB, accepting data loss. Several mechanisms are in place to reduce this risk, but it is impossible to ensure “no risk” with this method.

Symptoms persist? YES / NO

Database level issues

Failback:
• Get-MailboxDatabaseCopyStatus <DB_Name>
• Before executing the failback, check the columns “Status” & “ContentIndexState” in the output of the cmdlet above. Both should show “Healthy”. Otherwise, the failback will fail.
• If any other status is present, try the “DB Copy Rebuild” and/or “DB Catalog Rebuild” operations.
• Then, the failback is done using “Move-ActiveMailboxDatabase”:
    Move-ActiveMailboxDatabase -Identity <“DB_Name”> -ActivateOnServer <“Server_Name”>

Symptoms persist? YES / NO

Exchange Server level issues

Symptoms and common causes:
• The common causes and symptoms listed at “Database level”, however affecting all databases present on a given server;
• Exchange server services won’t start, logging errors in Event Viewer;
• Windows Server is corrupted and the O.S. is lost;
• Hardware damaged beyond repair.

Symptoms match? YES / NO

Exchange Server level issues

Exchange Role presenting issues:
• Mailbox Server
• Client Access Server / Hub Transport Server
• Dial-Tone Database*

Exchange Server level issues

Mailbox Server:
• Reset the computer account for the affected server, through ADUC (Active Directory Users and Computers) or any other supported method.
• Reinstall the Operating System exactly as the server was configured before the crash, and give it the same FQDN (fully qualified domain name) as the lost server.
• It is not possible to recover a server using another server name or O.S. version.
• Reconfigure all network adapters with the values of the lost server.
• Do not join the domain at this stage.

Mailbox Server type:
• DAG
• Standalone

Exchange Server level issues

Mailbox Server (DAG):
• Install the O.S. and the Exchange Server pre-requisites, hotfixes, and so on.
• Tip: Using an elevated CMD or EMS, go to the Exchange installation folder and execute: servermanagercmd -ip exchange-typical.xml (this script installs the Exchange pre-requisites (only) for all the roles. There are other scripts in this folder).
• Using the Exchange Management Shell of another server:
    a. Remove-MailboxDatabaseCopy <DB_Name\Server_Lost_Name>
    b. Remove-DatabaseAvailabilityGroupServer -Identity <DAG_Name> -MailboxServer <Server_Lost_Name> -ConfigurationOnly
    c. cluster.exe /cluster:<DAG Name> Node <Server Name> /Evict (forces removal of the lost server from the cluster database).
• Add the new server (with the same old name) to the Active Directory domain again.
• On the new server, open an elevated command prompt.
• In the Exchange 2010 installation folder path, execute: Setup /m:RecoverServer
• If there are healthy copies of this server's databases on the other DAG members:
    a. Add-DatabaseAvailabilityGroupServer -Identity <DAG_Name> -MailboxServer <Server_Recovered_Name>
    b. Add-MailboxDatabaseCopy -Identity <DB_Name> -MailboxServer <Server_Recovered_Name>
    c. If something fails during this process, the “reseed” process described under “Database level issues” may solve it.
• If no copies remain for this server on the DAG members, repeat step “a.” above and then recover the DBs from backup.

Symptoms persist? YES / NO

Exchange Server level issues

Mailbox Server (Standalone):
• Install the O.S. and the Exchange Server pre-requisites, hotfixes, and so on.
• Tip: Using an elevated CMD or EMS, go to the Exchange installation folder and execute: servermanagercmd -ip exchange-typical.xml (this script installs the Exchange pre-requisites (only) for all the roles. There are other scripts in this folder).
• Add the server to the Active Directory domain again, with the same FQDN (fully qualified domain name) and IP configuration.
• Open an elevated CMD prompt on the recovered server.
• In the installation folder path for Exchange 2010, execute: Setup /m:RecoverServer
• As we are considering a Standalone Mailbox Server, there are no database copies on other servers, so a restore from backup is the only way to recover database data.

Symptoms persist? YES / NO

Exchange Server level issues

Client Access Server / Hub Transport Server:
• Install the O.S. and the Exchange Server pre-requisites, hotfixes, and so on.
• Tip: Using an elevated CMD or EMS, go to the Exchange installation folder and execute: servermanagercmd -ip exchange-typical.xml (this script installs the Exchange pre-requisites (only) for all the roles. There are other scripts in this folder).
• Reset the computer account in AD (example: via AD Users and Computers) for the affected server.
• Join the server to the Active Directory domain again.
• On the server to be recovered, open an elevated CMD prompt.
• In the installation folder path for Exchange 2010, execute: Setup /m:RecoverServer
• Reconfigure NLB, CAS Array, OWA customizations, certificates (SSL), etc., as needed.

Symptoms persist? YES / NO

Time to restore back-up OR contact Microsoft Support

If you reached this page...
The issue your Exchange is facing is not a “regular” one, or the knowledge presented in this document is not enough to deal with it.

Next steps:
• If the servers are OK and you just need the data, restore from your backup;
• Or call the Microsoft Support Team (PSS) to get help from a representative specialized in the affected product. See the link below for contact information:
    • Using Microsoft Product Support Services
    • http://technet.microsoft.com/en-us/library/dd346877.aspx
