cpanelcon 2014: innodb anatomy

32

Upload: ryan-robson

Post on 07-Apr-2017

3.924 views

Category:

Technology


2 download

TRANSCRIPT

Page 1: cPanelCon 2014: InnoDB Anatomy
Page 2: cPanelCon 2014: InnoDB Anatomy

InnoDB Anatomy

Page 3: cPanelCon 2014: InnoDB Anatomy

The InnoDB EngineIntroduction to InnoDB

• Currently the default in MySQL (as of 5.5)

• Referential/Structural Integrity

• Consistent data

• Transactional

Page 4: cPanelCon 2014: InnoDB Anatomy

InnoDB is atomic in that its transactions have only two possible outcomes - complete fully, or fail completely.

• SUCCESS:

• FAILURE:

All changes committed.

All changes rolled back.

InnoDB Anatomy – ACID Compliance

ATOMICITY

COMMITROLLBACKUnchanged Data

Changes Applied

Page 5: cPanelCon 2014: InnoDB Anatomy

• Data stays consistent before, during, and after a transaction.

• No conflict of “versions”

• Successful transactions end with a commit. InnoDB maintains optimism.

Introduction to InnoDB Structure

CONSISTENCY

Valid State Work performed

Still a valid state

Page 6: cPanelCon 2014: InnoDB Anatomy

• Transactions cannot interact with each other

• Adjustable level of isolation.*

• Row-level locking

Introduction to InnoDB Structure

ISOLATION

* Isolation level changed via the transaction-isolation configuration option.

Page 7: cPanelCon 2014: InnoDB Anatomy

• Atomic transactions keep data durable.

• Changes are permanent once committed.

• Doublewrite buffer helps to recover from crashes that occur during page writes.

Introduction to InnoDB Structure

DURABILITY

Page 8: cPanelCon 2014: InnoDB Anatomy

Explore the InnoDB structure within the file system, and at its lower levels, to find out how it can affect database operations.

InnoDB Anatomy

The Goal

Page 9: cPanelCon 2014: InnoDB Anatomy

Understanding the Physical Structure of Data in InnoDB

Page 10: cPanelCon 2014: InnoDB Anatomy

InnoDB Anatomy – Physical StructurePhysical File Structure

Data

Dire

ctor

y(D

efau

lt: /v

ar/li

b/m

ysql

) Ibdata1System Tablespace

ib_logfile0Redo/Transaction Log File

ib_logfile1Redo/Transaction Log File

Database Folder

table.ibdTablespace File

table.frmFormat File

Page 11: cPanelCon 2014: InnoDB Anatomy

InnoDB Anatomy – Physical Structure

InnoDB

Tablespace

Page

Extent

Segment

Inode

File System

Partition

Disk Block

Allocation Unit

File

Inode

Page 12: cPanelCon 2014: InnoDB Anatomy

InnoDB Anatomy – Physical Structure

InnoDB

Tablespace

Page

Extent

Segment

Inode

User Records (Index Pages)…

First Inode Page Number

File HeaderInsert Buffer Bitmap

IBD File

Page 13: cPanelCon 2014: InnoDB Anatomy

InnoDB Anatomy – Physical Structure

InnoDB

Tablespace

Page

Extent

Segment

Inode

PAGE…

FIL Header - 38 bytes…

FIL Trailer - 8 bytes1638

4

Page 14: cPanelCon 2014: InnoDB Anatomy

InnoDB Anatomy – Physical Structure

InnoDB

Tablespace

Page

Extent

Segment

Inode

1MB Page 1

…Page 64

EXTENT

Page 15: cPanelCon 2014: InnoDB Anatomy

InnoDB Anatomy – Physical Structure

InnoDB

Tablespace

Page

Extent

Segment

Inode

ExtentExtent

SEGMENT

Page 16: cPanelCon 2014: InnoDB Anatomy

InnoDB Anatomy – Physical Structure

InnoDB

Tablespace

Page

Extent

Segment

Inode

File segment ID

Extent Data (Free/Partial/Full Listing)

Page 17: cPanelCon 2014: InnoDB Anatomy

InnoDB Anatomy – Physical Structure

Tabl

espa

ceSegment

Segment

Segment

Segm

entExtent

Extent

Extent

Extent

Extent

Extent Exte

nt

Page 18: cPanelCon 2014: InnoDB Anatomy

SYSTEM TABLESPACEibdata File

InnoDB Anatomy – System Tablespace

Undo Logs

Rollback Segment

Rollback Segment

Data Dictionary

SYS_TABLES

SYS_INDEXES

SYS_COLUMNS

SYS_FIELDS

Doublewrite Buffer

Block 1 (64

Pages)

Block 2 (64

Pages)

Change Buffer

Insert Buffering

Update Buffering

Purge Buffering

Undo Log Space

Rollback Segment

Page 19: cPanelCon 2014: InnoDB Anatomy

InnoDB Anatomy – System Tablespace

Separate your Undo Logs (5.6+ only)innodb_undo_logs; innodb_undo_tablespaces; innodb_undo_directory

I/O to the undo logs is random, instead of sequential like some other areas of InnoDB. Because of this, it makes sense to separate your Undo Log tablespaces out from the system tablespace onto a disk that handles random reads and writes more effectively, such as a SSD.

Use the Information Schema, or innodb_table_monitor, to view the Data Dictionary table data.

In 5.6, the information_schema database contains INNODB_SYS* tables that allow you to view data dictionary information directly in MySQL. Alternatively, you can create a table called “innodb_table_monitor” to dump the data dictionary into the MySQL error logs.

How can you use this?

Page 20: cPanelCon 2014: InnoDB Anatomy

InnoDB Anatomy – System Tablespace

Do you need the Doublewrite buffer?innodb_doublewrite

With Doublewrite buffer enabled, there is a 5-10% impact on I/O. If you operate on a transactional file system, you disable this to avoid this impact.

Customizing Change Buffering for your Workloadinnodb_change_buffering

Change buffering, by default in 5.5+, encompasses insert, update, and delete buffering. If your workload consists almost entirely of one or the other, it can make sense to limit this down to only one type of buffering.

How can you use this?

Page 21: cPanelCon 2014: InnoDB Anatomy

InnoDB in Memory and on Disk

Memory

Buffer Pool

Insert Buffer

Log BufferAdditional Memory

Disk

System TablespaceDoublewrite Buffer

Transaction Log Files

Insert Buffer

Undo Logs

Rollback Segment

Data Dictionary

Undo Buffering

Indexing

Thread Processing

Tablespace Files

Page 22: cPanelCon 2014: InnoDB Anatomy

InnoDB Anatomy – PagesPage Headers/Trailers

Name Byte Length Offset DescriptionFIL_PAGE_SPACE 4 0 Space ID

FIL_PAGE_OFFSET 4 4 Page Number

FIL_PAGE_PREV 4 8 Previous Page (in key order)

FIL_PAGE_NEXT 4 12 Next Page (in key order)

FIL_PAGE_LSN 8 16 LSN of page’s latest log record

FIL_PAGE_TYPE 2 24 Page Type

FIL_PAGE_FILE_FLUSH_LSN 8 26 Flushed-up-to LSN (only in space ID 0, page 0)

FIL_PAGE_ARCH_LOG_NO 4 34 Latest archived LSN (only in space ID 0, page 0)

FIL Header (38)

FIL Trailer (8)Name Byte Length Offset Description

FIL_PAGE_END_LSN 8 16376 Low 4 bytes: Checksum, Last 4 bytes: FIL_PAGE_LSN

storage/innobase/include/fil0fil.h

Page 23: cPanelCon 2014: InnoDB Anatomy

InnoDB Anatomy - Demonstration

Changing values directly

At the byte level, these values can be changed directly in many situations to “trick” InnoDB in one way or another. One good example of this is to get around a page checksum failure. You can change the stored checksum to match the calculated checksum, bypassing the crash and often allowing you sufficient access to your records.

How can you use this?

InnoDB: Page checksum 2047964429, prior-to-4.0.14-form checksum 4196043695InnoDB: stored checksum 1873408413, prior-to-4.0.14-form stored checksum 1946395024# printf '%X\n' 2047964429; printf '%X\n' 41960436957A11750D Primary calculated checksumFA1A8BAF “Old-style” calculated checksum# expr 16384 \* 6 Example Page 698304 Starting byte offset for Page 6Writing the primary calculated checksum over the stored value of page 6:# printf ‘\x7A\x11\x75\x0D’ | dd of=table.ibd bs=1 seek=98304 count=4 conv=notrunc

Writing the “old-style” calculated checksum over the stored value of page 6:# printf ‘\x7A\x11\x75\x0D’ | dd of=table.ibd bs=1 seek=98304 count=4 conv=notrunc

Page 24: cPanelCon 2014: InnoDB Anatomy

•Stored in 2 files by default (ib_logfile0/1)

•Treated as single file•Circular buffer

LOG BLOCKHeader (12)Log Records

Trailer (4)…

LOG BLOCKHeader (12)Log Records

Trailer (4)…

LOG BLOCKHeader (12)Log Records

Trailer (4)…

ib_lo

gfile

0ib

_logfi

le1

InnoDB Anatomy – Redo Logs

Structure• Log blocks are 512 bytes•Each block contains

checkpoint data

The Logical Log FileThe Redo Logs

Page 25: cPanelCon 2014: InnoDB Anatomy

InnoDB Anatomy – Redo Logs

Optimized log file sizeinnodb_log_file_size

Larger size means less checkpoint flushing required, reducing I/O impact. Balance with expected recovery time required as a result of the size (less of an issue in 5.6).General Formula: (Current LSN – LSN 60 seconds later) * 60 / 1024 / 1024

Optimized log buffer sizeinnodb_log_buffer_size

Log buffer allows transactions to move forward without having to write the log to disk before commit. Increased size allows larger transactions to run without requiring writes to disk before a commit is performed.

How can you use this?

Page 26: cPanelCon 2014: InnoDB Anatomy

InnoDB Anatomy – Index PagesINDEX Pages - B+Tree Structure

•Efficient method of storing data on disk in a tree format.•Actual records stored in leaf pages (level 0).•Root-level pages exist at the top of the tree structure.•Non-leaf pages contain only pointers to leaf pages.

Level 0

Level 1

Level 2 Root

Non-Leaf

Leaf Leaf

Non-Leaf

Leaf

Page 27: cPanelCon 2014: InnoDB Anatomy

InnoDB Anatomy – Index PagesB+Tree Structure – Basic Index Example

Root NodeCustomer IDs

1-500

Non-Leaf 1-250

Non-Leaf251-500

Leaf Node251-260

Leaf Node261-270

Leaf Node1-10

Leaf Node11-20

Page 28: cPanelCon 2014: InnoDB Anatomy

InnoDB Anatomy – Index PagesINDEX Pages

•Not physically in order

•User data “grows down”

•Page directory “grows up”

FIL Header (38)

… Page Directory

FIL Trailer (8)

INDEX Header (36)FSEG Header (20)

System Records (26)

User Data …

EMPTY

Page 29: cPanelCon 2014: InnoDB Anatomy

InnoDB Anatomy – Index PagesINDEX Page Header (after FIL Header)

Name Byte Length Offset DescriptionPAGE_N_DIR_SLOTS 2 38 + 0 Number of Slots in Page Directory

PAGE_HEAP_TOP 2 38 + 2 Pointer to Record Heap Top

PAGE_N_HEAP 2 38 + 4 Number of Records in Heap

PAGE_FREE 2 38 + 6 Pointer to start of page’s free-record list

PAGE_GARBAGE 2 38 + 8 Number of bytes in “deleted” records

PAGE_LAST_INSERT 2 38 + 10 Pointer to last inserted record, or NULL if this has been reset – eg. by a delete.

PAGE_DIRECTION 2 38 + 12 Last Insert direction, PAGE_LEFT, PAGE_RIGHT …

PAGE_N_DIRECTION 2 38 + 14 Consecutive inserts in the same direction

PAGE_N_RECS 2 38 + 16 Number of user records on the page

PAGE_MAX_TRX_ID 8 38 + 18 Highest ID of transaction that may have modified a record on the page.

PAGE_LEVEL 2 38 + 26 Level of node in index tree

PAGE_INDEX_ID 4 38 + 28 Index ID that page belongs to

storage/innobase/include/page0page.h

Page 30: cPanelCon 2014: InnoDB Anatomy

InnoDB Anatomy – Demonstration

Demonstration

Determining page level on an INDEX pageFirst, find your page’s start byte:# expr 16384 \* 349152The offset of the PAGE_LEVEL value is 26 after the FIL Header (38):# expr 49152 + 38 + 2649216The byte-length is 2# xxd –ps –s 49216 –l 2 customer.ibd0001Page Level: 1

/var/lib/mysql/testdb/

Page 31: cPanelCon 2014: InnoDB Anatomy

Additional Resources

Page 32: cPanelCon 2014: InnoDB Anatomy

Conclusion

• Jeremy Cole• http://blog.jcole.us/• https://github.com/jeremycole/

• Percona• http://www.percona.com/files/percona-live/justin-inno

db-internals.pdf• MySQL Internals Documentation & Source

• http://dev.mysql.com/doc/internals/en/innodb.html• https://launchpad.net/mysql

Sources and Thanks