in the beginning, there was the log - webinar with martin hubel

32
In the beginning, there was the log Martin Hubel + 1 905-764-7498 [email protected] 1

Upload: embarcadero-technologies

Post on 10-Jul-2015

93 views

Category:

Software


2 download

TRANSCRIPT

Page 1: In the Beginning, There Was the Log - Webinar with Martin Hubel

In the beginning,there was the log

Martin Hubel+ 1 905-764-7498

[email protected]

1

Page 2: In the Beginning, There Was the Log - Webinar with Martin Hubel

Agenda

1. The role of the database recovery log2. How backups fit in3. Recovery scenarios4. Planning for recovery

2

Page 3: In the Beginning, There Was the Log - Webinar with Martin Hubel

Disclaimer

• Every attempt has been made to be accurate with product information. Trademarks are the property of their respective owners.

• I work mainly on DB2, but have used and supported other products.

• All DBMS mentioned in today’s webinar are commercial grade, used by hundreds of companies to manage their enterprises. No attempt is made here to compare DBMS products for functionality.

Copyright©2014 Martin Hubel Consulting Inc. 3

Page 4: In the Beginning, There Was the Log - Webinar with Martin Hubel

Assumptions

• This presentation is about OLTP applications• The need for backups is known• The need for recovery is also understood

• Databases are enabled for recovery• BI / DW have different recovery requirements

• Data is recovered / rebuilt / reload from another source

Copyright©2014 Martin Hubel Consulting Inc. 4

Page 5: In the Beginning, There Was the Log - Webinar with Martin Hubel

The database recovery log

• Log contains information necessary to:• Redo completed units of work• Undo incomplete units of work• Rebuild system status

• Today’s presentation will deal mainly with units of work (transactions) and their effect on the recovery log

Copyright©2014 Martin Hubel Consulting Inc. 5

Page 6: In the Beginning, There Was the Log - Webinar with Martin Hubel

A successful transaction

• Single transaction can contain many update actions• All updates must complete • Then a commit makes changes permanent• Successful termination is recorded in the recovery log

• Any failure rolls changes back to beginning• The DB is considered consistent before the transaction and after the

commitCopyright©2014 Martin Hubel Consulting Inc. 6

•Begin UR •End UR

•INS A •UPD B •UPD C •INS A•DEL D•INS E

•One Business Transaction

Page 7: In the Beginning, There Was the Log - Webinar with Martin Hubel

Unit of work

• Updates performed in the subsystem between two points of consistency

• Point of consistency (commit point) is a point where all recoverable data for an application is consistent with other data• Equivalent terms: Synchpoint, Logical Unit of Work or

Unit of Recovery (UR)• Some environments require a 2 phase commit

Copyright©2014 Martin Hubel Consulting Inc. 7

Page 8: In the Beginning, There Was the Log - Webinar with Martin Hubel

Batch transactions

• Batch jobs handle transaction can contain many update actions• Many business transactions are included in a single unit of work• The commit makes changes permanent for all transactions

• Any failure rolls changes back to beginning• Many tens, hundreds or even thousands of transactions are processed in

a single UR

Copyright©2014 Martin Hubel Consulting Inc. 8

•Begin UR •End UR

Page 9: In the Beginning, There Was the Log - Webinar with Martin Hubel

Rollback

Copyright©2014 Martin Hubel Consulting Inc. 9

• During rollback, data changes are not removed• The log is read and the changes reversed• The data is returned to its last consistent state (last commit)

• Some badly designed applications issue rollback statements following if validation fails• Can be a lot of log activity for nothing• Always complete all validation prior to performing first update

•Begin UR •End UR

•INS A •UPD B •UPD C •INS E

•One Business Transaction

•DEL E •UPD C` •UPD B` •DEL A

•Failure

Page 10: In the Beginning, There Was the Log - Webinar with Martin Hubel

Performance of new application

• A new financial application had a light transaction processing load• Transactions were complex and had a lot of validation and

updates to perform• Users regularly had to correct input before the transaction was

ready for processing• Performance was poor. Design showed:

• After each validation check, the SQL updates to that point were performed

• If a further check failed, the updates were reversed via rollback• This problem was found via a high volume of log activity

• Much of it for no benefitCopyright©2014 Martin Hubel Consulting Inc. 10

Page 11: In the Beginning, There Was the Log - Webinar with Martin Hubel

Transaction Recovery/Backout

• Some DBMS products default to this type of recovery (rollback of incomplete URs only)• Logging (log retain) in these products must be turned on for

full recoveryCopyright©2014 Martin Hubel Consulting Inc. 11

TIMEOUT

Work RollbackWait

Work Rollback

DEADLOCK

Work

Wait

Continue

a b

b a

Page 12: In the Beginning, There Was the Log - Webinar with Martin Hubel

Log files

• “In the beginning, there was the log”• DBMS writes to the active log file

• May be dual logs or mirrored by DBMS May be on RAID devices at hardware level

• Logs use buffered I/O For changes to data

• Logs are written synchronously at commit time• In this way, the log is close to being correct if failure occurs• Design I/O system to avoid waits for log writes

• DBMS must be able to write changes to the log

Copyright©2014 Martin Hubel Consulting Inc. 12

Page 13: In the Beginning, There Was the Log - Webinar with Martin Hubel

Archiving logs

• Current log is called the active log• Log files fill up and switch automatically to new files

• Called archive logs• Archive files may remain in same directory for some period• May be automatically offloaded to some facility like TSM

• Problems can occur in the archival process• If log file system fills, the DBMS stops!

Active log

Copyright©2014 Martin Hubel Consulting Inc. 13

Other storage

Archive logs

Page 14: In the Beginning, There Was the Log - Webinar with Martin Hubel

Batch, long running, and large transactions

• Thousands (or more) updates can be performed, e.g. by: • A batch job • A single SQL statement

• Long-running transactions cause problems:• Many locks taken during the updates• Many log records and log files written before a commit

• Commit frequency must be managed• Rollback is also problematic

• Requires log to be read, and to write the opposite action to reverse the change

Copyright©2014 Martin Hubel Consulting Inc. 14

Page 15: In the Beginning, There Was the Log - Webinar with Martin Hubel

“The audit table is too big”

• The new DBA/Linux sysadmin notices the size of the audit table is 30 million rows• Issues a DELETE FROM AUDIT_TABLE

WHERE DATE < ‘60 days ago’• 10M rows will be deleted

• The effect:• All users locked out of the application • No estimate of completion time

Copyright©2014 Martin Hubel Consulting Inc. 15

Page 16: In the Beginning, There Was the Log - Webinar with Martin Hubel

“The audit table is too big” - 2

• After many complaints and about 2 hours, the DBA cancels the Delete

• Rollback starts:• Rollback has to read the log to finds the updates and

write the opposite• As such, rollback typically takes twice as long

• The users continue to be unhappy and are now more vocal

Copyright©2014 Martin Hubel Consulting Inc. 16

Page 17: In the Beginning, There Was the Log - Webinar with Martin Hubel

“The audit table is too big” - 3

• The DBA, using his sysadmin kills, starts killing DB2 processes• Kill -9 has immediate effects to stop DB2 and the rollback

• But now, the rollback is incomplete• At restart, DB2 must remove the effect of the incomplete

rollback and start the full rollback afterwards• Estimated full recovery time is 24 hours

• Some behind-the-scenes involvement reduces the outage to 8 hours on a 24x7 system

• CIO asks: “What should we do with the DBA?”

Copyright©2014 Martin Hubel Consulting Inc. 17

Page 18: In the Beginning, There Was the Log - Webinar with Martin Hubel

... And Now

• A word from Embarcadero...

Copyright©2014 Martin Hubel Consulting Inc. 18

Page 19: In the Beginning, There Was the Log - Webinar with Martin Hubel

Can the log be kept forever?

• Hypothetically yes, the answer is no• Not practical• Recovery would take too long for one thing• Log retention would also be impractical

• Backups taken within the DBMS are recorded and provide starting points for recovery

• Once all databases or tables (tablespaces) have been copied, the oldest logs can be deleted

Copyright©2014 Martin Hubel Consulting Inc. 19

Page 20: In the Beginning, There Was the Log - Webinar with Martin Hubel

Backup copies

• Backups provide a starting point for the recovery process• Need at least one full backup• Two generations are safer

• DBMS options for backups:• Database• Table or tablespace• Incremental or delta• Online or offline

• Once all databases have a valid backup, the older log files can be deleted• Set generous retention periods

Copyright©2014 Martin Hubel Consulting Inc. 20

Page 21: In the Beginning, There Was the Log - Webinar with Martin Hubel

Gaps in the log

• A big recovery consideration is to ensure there are no gaps within the log range• A gap indicates that some log records are missing,

and DBMS is unable to determine what those records might have contained

• As such, forward log processing stops• It is advantageous to turn logging off for certain

situations or operations• E.g, large loads

Copyright©2014 Martin Hubel Consulting Inc. 21

Page 22: In the Beginning, There Was the Log - Webinar with Martin Hubel

Implications of turning off logging

• Logging is not free• Significant performance impact for certain operations

such as batch updates or load operations• Not logged operations affect the ability to forward

recover objects using the log• DBMS does not know what or how data was changed

• Therefore, DBMS requires a backup following not logged operations to re-establish a recovery point

Copyright©2014 Martin Hubel Consulting Inc. 22

Page 23: In the Beginning, There Was the Log - Webinar with Martin Hubel

Production mortgage system• DBA group was responsible for backups

• Forgot to take regular backups• Last backup was January 26

• System group managed the logs• Set a “reasonable” retention period of 60 days• On April 1, they deleted the logs older than Feb 1

• On April 8, a hardware problem occurred• Production was recovered to January 26

Copyright©2014 Martin Hubel Consulting Inc. 23

Page 24: In the Beginning, There Was the Log - Webinar with Martin Hubel

Application recovery scenarios

• Transaction recovery/backout• Application backout• Point-in-time recovery• Checkpoint restart• Data synchronization• Data inconsistencies• Out of space• Media failure• Accidentally deleted data

Copyright©2014 Martin Hubel Consulting Inc. 24

Page 25: In the Beginning, There Was the Log - Webinar with Martin Hubel

Recovery guidelines

• Test your recovery procedures• Make a backup copy prior to recovery - things can get

worse• Make a copy after recovery if you think it might

happen again• Make two copies using the DBMS backup utility• Periodically run integrity checking utilities to ensure

recoverability

Copyright©2014 Martin Hubel Consulting Inc. 25

Page 26: In the Beginning, There Was the Log - Webinar with Martin Hubel

Common user errors - 1

• Delete or write over archive log or image copy• Without necessary backups, recovery may be

impossible• Possible storage management issue

• Misuse of disk level dump/restore• Valid for disaster planning, but dump everything – Logs DBMS system objects User data

Copyright©2014 Martin Hubel Consulting Inc. 26

Page 27: In the Beginning, There Was the Log - Webinar with Martin Hubel

Common user errors - 2

• Common Programming Errors• Logical unit of work identification• Commit/CHKP placed incorrectly• Too many commits• Coding methods may vary throughout application,

e.g. Use of separate read and write routines Checking of SQL return code codes

Copyright©2014 Martin Hubel Consulting Inc. 27

Page 28: In the Beginning, There Was the Log - Webinar with Martin Hubel

Application Backout

• Required when multiple jobs running concurrently against same database

• Work was committed against database• Backout may not be possible or practical in all cases

Copyright©2014 Martin Hubel Consulting Inc. 28

Startof Job

Time

NormalJob End Job 1

Job 2

Job 3

Job 4

Backout, repair, and rerun

Page 29: In the Beginning, There Was the Log - Webinar with Martin Hubel

SQL change in production

• “Oops, we thought that was the test system”• An SQL error changed all addresses to the same

value• 780,000 rows in table

• “Martin, please change them back”• No, you can’t restore from a backup New customers have signed up Old customers may have updated their information

Copyright©2014 Martin Hubel Consulting Inc. 29

Page 30: In the Beginning, There Was the Log - Webinar with Martin Hubel

SQL change in production

• While this was pressing, it was not a “show stopper”• Fortunately, I had unloaded this table for some reason

• This became a source for the correct addresses• Alternatives might have been a redirected restore /

transportable tablespace• We were able to load the backup table

• Found SQL on internet and adapted it• Ultimately updated 362,000 addresses

Copyright©2014 Martin Hubel Consulting Inc. 30

Page 31: In the Beginning, There Was the Log - Webinar with Martin Hubel

Summary

• All update activity is recorded in the DBMS log• Log performance is part of overall DBMS

performance• Backups and logs are both used for recovery• Plan and test recovery scenarios prior to needing them

Copyright©2014 Martin Hubel Consulting Inc. 31

Page 32: In the Beginning, There Was the Log - Webinar with Martin Hubel

Martin HubelMartin Hubel Consulting Inc.

[email protected]

In the beginning, there was the log

32