chuck lathrope (@sqlguychuck) email: [email protected] blog: 6/12/2010 sqlsaturday #43

32
Replication Performance Tuning and Troubleshooting for SQL Server Chuck Lathrope (@SQLGuyChuck) Email: [email protected] Blog: www.sqlwebpedia.com 6/12/2010 SQLSaturday #43

Upload: isaiah-goodrick

Post on 30-Mar-2015

215 views

Category:

Documents


0 download

TRANSCRIPT

  • Slide 1

Chuck Lathrope (@SQLGuyChuck) Email: [email protected] Blog: www.sqlwebpedia.com 6/12/2010 SQLSaturday #43 Slide 2 Chuck Lathrope 20 years in IT and Engineering, mainly supporting Microsoft products Top 5 finalist for Exceptional DBA award 2009. Past 8 years have been in SQL Development and Database Administration. Currently, I am the Database Operations Manager for Demand Media. I support 13 publication servers, with over 60 subscribers on the WAN. I blog on www.sqlwebpedia.com Twitter @SQLGuyChuckwww.sqlwebpedia.com Slide 3 What I Will Cover Today Replication architecture designs Network and server optimizations Using Replication Agent Profiles Monitoring replication Troubleshooting replication Slide 4 Replication Architecture Designs Server to Client data replication Exchanging data with mobile users Consumer point of sale (POS) applications Integrating data from multiple sites (regional office locations) Server to Server data replication Improving scalability and availability Data warehousing and reporting Integrating data from multiple sites Integrating heterogeneous data (Oracle, etc.) Offloading batch processing Slide 5 Network and Server Topology Example Slide 6 Windows Server NIC Tweaks Windows 2003 SP2+ consider turning off TCP Chimney (Windows 2008 should be ok): NETSH int ip set chimney disabled Network binding order Put your team NIC at top Disabled NICs at bottom Update your NIC drivers! Slide 7 Registry Tweaks for NIC HKLM\System\CurrentControlSet\Services\Netbt\Para meters NodeType = 8 for environments without WINS. EnablePMTUDiscovery = 1 (Ignore security concerns as DoS issue was patched long ago.) HKLM\SYSTEM\CurrentControlSet\Services\Tcpip\Par ameters\Interfaces\ TcpAckFrequency = 1 Slide 8 Server Configuration Use a dedicated Distribution server If on Windows 2003 make sure your disks are aligned with diskpart when you create partition. Format disk with 64KB allocation unit size. Default is 4KB. RAID 10 if you can afford it. Published databases LOG file better be on RAID 10. Optimize your files Service account should have Perform Volume Maintenance Tasks Local Security Policy right. Slide 9 Database File Optimizations Dont forget that default values for new databases are sub- optimal. I recommend 4 data files, 1 log file. Size all data files the same, give 1 years worth of empty space, growth size should be large for data and not percentage. Log growth of 300-500MB. Slide 10 Replication Config Recommendations With many subscriptions, use Pull Subscriptions method. Keep snapshot folder away from DB files. Run agents continuously instead of on very frequent schedules. Potentially use SubscriptionStreams option on Dist Agent or create many publications (parallel threads) Minimize use of publication filters Create Agent Profiles for more performance tweaking Slide 11 Create Agent Profiles Replication Snapshot Agent Replication Log Reader Agent Replication Distribution Agent Replication Merge Agent Replication Queue Reader Agent Create multiple to tailor to your infrastructure Slide 12 Custom Agent Profiles Create a new Replication Profile in Replication Monitor by right clicking a subscriber and select Agent Profile, copy the default one and modify it; uncheck the Show only parameters used in this profile checkbox and tweak and test values. Slide 13 Agent Properties to Change CommitBatchSize - 1000 (approx max number of batches) CommitBatchThreshhold - 2000 (approx max total commands for all batches - definitely test) HistoryVerboseLevel 1 MaxBCPThreads - 4 (Nothing to do with ongoing operations, just when you create a snapshot it won't be single threaded.) TransactionsPerHistory - 1000 (Just limits amount of updates you get in Repl Monitor, tweak to your comfort level) QueryTimeout 4000 PacketSize - 12288 (This is on a good network. Adjust in 4096 increments +/- until SQL Agent Job doesn't crash - Pre SQL 2005 SP3 there is a bug with large packetsizes.) Slide 14 Large Data Change Performance Tips Change to using stored procedures to update/delete many rows at subscriber(s). Default is definition only. Slide 15 ReplMon Performance Considerations Replication Monitor can be a performance hindrance. If you have a large number of publications or subscriptions, consider setting a less frequent automatic refresh schedule for the user interface. Avoid concurrently running multiple instances of Replication Monitor. Avoid registering a large number of Distributors and setting Replication Monitor to automatically connect to all of them. Slide 16 Evaluate Round Trip Latency Use tracer tokens Or look at Subscriber Latency (full round-trip) or Undist Commands Slide 17 Long Distance Snapshot Delivery With long distance or slow VPN tunnel Consider going out through the internet Control when snapshots occur dont let them control you. Manually zip up data with 7zip application (dont use SQL snapshot compression). And copy data. Deliver one copy to remote location and configure other servers to use that local server share. Slide 18 Network and Server Topology Example Copy files to local DC server \\sub1\snapshot Slide 19 Alerting with SQL Server In replication monitor, Warnings tab for Publication: Slide 20 Alerting with SQL Server, cont. Configuring alert from Replmon prepopulates error number for you. Slide 21 My Custom Error Emails Slide 22 My Hourly Replication Status Email This is a custom email that shows Status and Undelivered Commands total. Slide 23 TSQL to View Errors This error logging table is good start, but may not have all issues, so do check msrepl_errors table. select top 10 * from msdb.dbo.sysreplicationalerts order by alert_id desc select top 300 * from distribution.dbo.MSrepl_errors (nolock) where time > getdate() -.05 order by time desc Get the xact_seqno value and command_id value exec sp_browsereplcmds @xact_seqno_start = '0x00000DDC0003B16D000600000000', @xact_seqno_end = '0x00000DDC0003B16D000600000000' --,@publisher_database_id= null, @command_id = 5 Result: {CALL [sp_MSupd_dboOrders] (,,,,,,,,,,,,8,,14318718,0x0010)} It is doing an update for Orders with OrderID 14318718 Slide 24 Example Troubleshooting Scenario: Subscription expired, so we needed to reinitialize all subscribers You could use the GUI to do this, but wanted quicker method exec sp_reinitsubscription @publication = N'Orders', @subscriber = N'all' Msg 3933, Level 16, State 1, Procedure sp_MSrepl_getdistributorinfo, Line 93 Cannot promote the transaction to a distributed transaction because there is an active save point in this transaction. Slide 25 Example Troubleshooting, cont. First thought was to see if Distributed Trans work (you may need to create linked server for this test): begin distributed transaction select * from [RemoteServer].master.sys.databases commit tran I got a list of databases, so all was good. If you dont check out my troubleshooting page at http://www.sqlwebpedia.com/content/msdtc- troubleshooting http://www.sqlwebpedia.com/content/msdtc- troubleshooting Slide 26 Example Troubleshooting, cont. My next thought was to check for DDL triggers on published database: DISABLE TRIGGER [tr_Log_DDLEvent] ON DATABASE Checking SQL Error Log, found this message: Date6/2/2010 3:25:18 PM LogSQL Server (Current - 6/3/2010 11:59:00 PM) Message Login failed for user 'DM\SQLService'. Reason: Failed to open the explicitly specified database. [CLIENT: ] This is the service account, so I should never see this issue! Restarted SQL Service and all was good. Slide 27 Event Viewer Errors Date6/4/2010 4:52:23 PM LogSQL Server (Current - 6/4/2010 4:32:00 PM) Sourcespid52 Message Error: 14151, Severity: 18, State: 1. Message: Replication-Replication Distribution Subsystem: agent BLVWDB03-Orders-Orders-BLVWDB02-62 failed. The subscription(s) have been marked inactive and must be reinitialized. NoSync subscriptions will need to be dropped and recreated. Slide 28 Add Verbose Output to Agent Job Refer to http://support.microsoft.com/kb/q312292 http://support.microsoft.com/kb/q312292 Says append these parameters to replication "Run agent" job step. -Output C:\ReplOutput.txt -OutputVerboseLevel 2 -Publisher [VDB1] -PublisherDB [Orders] -Distributor [VDB3\INS2] -DistributorSecurityMode 1 -Continuous -Output C:\ReplOutput.txt -OutputVerboseLevel 2 Slide 29 Skip a Transaction Row Exec sp_setsubscriptionxactseqno @publisher='vdb1',@publisher_db ='AdvWorks',@publication = 'Orders',@xact_seqno=0x0015731B0002331C use Reports; --must be active at the subscriber. exec sp_setsubscriptionxactseqno @publisher='vdb1',@publisher_db ='AdvWorks',@publication = 'Orders', @xact_seqno=0x0015731B0002331C ORIGINAL XACT_SEQNOUPDATED XACT_SEQNOSUBSCRIPTION STREAM COUNT 0x0015731B0002331500280000x0015731B0002331C00011 Have to restart the distribution agent job for it to work If you are not active in subscriber DB, you will see this error: Msg 20017, Level 16, State 1, Procedure sp_setsubscriptionxactseqno, Line 69 The subscription on the Subscriber does not exist. Credit: http://blogs.msdn.com/chrissk/archive/2009/09/08/how-to-skip-a-transaction-in-sql- 2005-2008-transactional-replication.aspxhttp://blogs.msdn.com/chrissk/archive/2009/09/08/how-to-skip-a-transaction-in-sql- 2005-2008-transactional-replication.aspx Slide 30 Replmon Word of Warning Looks like one error, but Latency is 0 on Maildb subscription and in reality it has failed but is in retry mode. Slide 31 Takeaway Points Understand your network Use a dedicated distributor Optimize your data files Test your custom Agent Profile config options Dont rely on visually watching Replication Monitor for error monitoring. Minimize use of filters on publications. Control subscription snapshots over WAN. Slide 32 Useful Links and Used References Enhancing Transactional Replication Performance http://msdn.microsoft.com/en-us/library/ms151762.aspx http://msdn.microsoft.com/en-us/library/ms151762.aspx Publishing Stored Procedure Execution in Tran Repl http://msdn.microsoft.com/en-us/library/ms152754(SQL.90).aspx http://msdn.microsoft.com/en-us/library/ms152754(SQL.90).aspx Advanced Merge Replication Conflict Detection and Resolution http://technet.microsoft.com/en-us/library/ms151257.aspx http://technet.microsoft.com/en-us/library/ms151257.aspx TCP/IP and NBT configuration parameters for Windows XP (ignore XP) http://support.microsoft.com/kb/314053/en-ushttp://support.microsoft.com/kb/314053/en-us Importance of network binding order http://technet.microsoft.com/en-us/library/dd391967(WS.10).aspx http://technet.microsoft.com/en-us/library/dd391967(WS.10).aspx An update to turn off default SNP features is available for Server 2003 http://support.microsoft.com/default.aspx?scid=kb;EN-US;948496 http://support.microsoft.com/default.aspx?scid=kb;EN-US;948496 http://www.microsoft.com/whdc/system/sysperf/perf_tun_srv.mspx We are hiring! http://www.demandmedia.com/careershttp://www.demandmedia.com/careers