cs 245: database system principles notes 03: disk organization
DESCRIPTION
CS 245: Database System Principles Notes 03: Disk Organization. Hector Garcia-Molina. Topics for today. How to lay out data on disk How to move it to memory. What we have available: Bytes. 8 bits. What are the data items we want to store?. a salary a name a date a picture. - PowerPoint PPT PresentationTRANSCRIPT
CS 245 Notes 3 1
CS 245: Database System Principles
Notes 03: Disk Organization
Hector Garcia-Molina
CS 245 Notes 3 2
• How to lay out data on disk• How to move it to memory
Topics for today
CS 245 Notes 3 3
What are the data items we want to store?
• a salary• a name• a date• a pictureWhat we have available: Bytes
8bits
CS 245 Notes 3 4
To represent:• Integer (short): 2 bytes
e.g., 35 is 00000000 00100011
• Real, floating pointn bits for mantissa, m for exponent….
CS 245 Notes 3 5
• Characters various coding schemes suggested,
most popular is ascii
To represent:
Example:A: 1000001a: 11000015: 0110101LF: 0001010
CS 245 Notes 3 6
• Booleane.g., TRUE FALSE
1111 11110000 0000
To represent:
• Application specifice.g., RED 1 GREEN 3
BLUE 2 YELLOW 4 …Can we use less than 1
byte/code?Yes, but only if desperate...
CS 245 Notes 3 7
• Datese.g.: - Integer, # days since Jan 1, 1900 - 8 characters, YYYYMMDD - 7 characters, YYYYDDD(not YYMMDD! Why?)
• Timee.g. - Integer, seconds since midnight - characters, HHMMSSFF
To represent:
CS 245 Notes 3 8
• String of characters– Null terminated
e.g.,
– Length givene.g.,
- Fixed length
c ta
c ta3
To represent:
CS 245 Notes 3 9
• Bag of bits
Length Bits
To represent:
CS 245 Notes 3 10
Key Point
• Fixed length items
• Variable length items- usually length given at beginning
CS 245 Notes 3 11
• Type of an item: Tells us how to interpret(plus size if
fixed)
Also
CS 245 Notes 3 12
Data Items
Records
Blocks
Files
Memory
Overview
CS 245 Notes 3 13
Record - Collection of related data
items (called FIELDS)E.g.: Employee record:
name field,salary field,date-of-hire field, ...
CS 245 Notes 3 14
Types of records:
• Main choices:– FIXED vs VARIABLE FORMAT– FIXED vs VARIABLE LENGTH
CS 245 Notes 3 15
A SCHEMA (not record) containsfollowing information
- # fields- type of each field- order in record- meaning of each field
Fixed format
CS 245 Notes 3 16
Example: fixed format and length
Employee record(1) E#, 2 byte integer(2) E.name, 10 char. Schema(3) Dept, 2 byte code
55 s m i t h 02
83 j o n e s 01Records
CS 245 Notes 3 17
• Record itself contains format“Self Describing”
Variable format
CS 245 Notes 3 18
Example: variable format and length
4I52 4S DROF46
Field name codes could also be strings, i.e. TAGS
# F
ield
sCo
de id
entif
ying
fie
ld a
s E#
Inte
ger t
ype
Code
for E
nam
eSt
ring
type
Leng
th o
f str.
CS 245 Notes 3 19
Variable format useful for:
• “sparse” records• repeating fields• evolving formats
But may waste space...
CS 245 Notes 3 20
• EXAMPLE: var format record with repeating fields
Employee one or more children3 E_name: Fred Child: SallyChild: Tom
CS 245 Notes 3 21
Note: Repeating fields does not imply- variable format, nor- variable size
John Sailing Chess --
• Key is to allocate maximum number ofrepeating fields (if not used null)
CS 245 Notes 3 22
Many variants betweenfixed - variable format:
Ex. #1: Include record type in record
record type record lengthtells me whatto expect(i.e. points to schema)
5 27 . . . .
CS 245 Notes 3 23
Record header - data at beginning
that describes recordMay contain:
- record type- record length- time stamp- other stuff ...
CS 245 Notes 3 24
Ex #2 of variant between FIXED/VAR format
• Hybrid format– one part is fixed, other variable
E.g.: All employees have E#, name, deptother fields vary.
25 Smith Toy 2 retiredHobby:chess
# of varfields
CS 245 Notes 3 25
Also, many variations in internal
organization of recordJust to show one: length of
field
3 F310 F1 5 F2 12 * * *
3 32 5 15 20 F1 F2 F3total size
offsets
0 1 2 3 4 5 15 20
CS 245 Notes 3 26
Question:We have seen examples for
* Fixed format and length records* Variable format and length records
(a) Does fixed format and variable lengthmake sense?(b) Does variable format and fixed lengthmake sense?
CS 245 Notes 3 27
Other interesting issues:
• Compression– within record - e.g. code selection– collection of records - e.g. find
common patterns• Encryption
CS 245 Notes 3 28
Encrypting Records
trustedprocessor
newrecord
r dbmsE(r)
E(r1)E(r2)E(r3)E(r4)...
CS 245 Notes 3 29
Encrypting Records
trustedprocessor
searchF(r)=x
dbms??
E(r1)E(r2)E(r3)E(r4)...
CS 245 Notes 3 30
Search Key in the Clear
trustedprocessor
searchk=2
dbmsQ: k=2
[1, E(b1)][2, E(b2)][3, E(b3)][4, E(b4)]
...• each record is [k,b]• store [k, E(b)]• can search for records with k=x
A: [2, E(b2)]
CS 245 Notes 3 31
Encrypt Key
trustedprocessor
searchk=2
dbmsQ: k’=E(2)
[E(1), E(b1)][E(2), E(b2)][E(3), E(b3)][E(4), E(b4)]
...• each record is [k,b]• store [E(k), E(b)]• can search for records with k=E(x)
A: [E(2), E(b2)]
CS 245 Notes 3 32
Issues• Hard to do range queries• Encryption not good• Better to use encryption that does not
always generate same cyphertext
Ek
DE(k, random) k
simplification
CS 245 Notes 3 33
How Do We Search Now?
trustedprocessor
searchk=2
dbmsQ: k’=E(2)
[E(1, abc), E(b1)][E(2, dhe), E(b2)][E(3, nft), E(b3)][E(2, lkz), E(b4)]
...• each record is [k,b]• store [E(k, rand), E(b)]• can search for records with k=E(x,???)?
A: [E(2,dhe), E(b2)] [E(2, lkz), E(b4)]
???
CS 245 Notes 3 34
Solution?• Develop new decryption function:
D(f(k1), E(k2, rand)) is true if k1=k2
CS 245 Notes 3 35
Solution?• Develop new decryption function:
D(f(k1), E(k2, rand)) is true if k1=k2
trustedprocessor
searchk=2
dbms
Q: check if D(f(2),*) true
[E(1, abc), E(b1)][E(2, dhe), E(b2)][E(3, nft), E(b3)][E(2, lkz), E(b4)]
...
A: [E(2,dhe), E(b2)] [E(2, lkz), E(b4)]
CS 245 Notes 3 36
Issues?• Cannot do non-equality predicates• Hard to build indexes
CS 245 Notes 3 37
Next: placing records into blocks
blocks ...
a file
assume fixedlength blocks
assume a single file (for now)
CS 245 Notes 3 38
(1) separating records(2) spanned vs. unspanned(3) mixed record types – clustering(4) split records(5) sequencing(6) indirection
Options for storing records in blocks:
CS 245 Notes 3 39
Block
(a) no need to separate - fixed size recs.(b) special marker(c) give record lengths (or offsets)
- within each record- in block header
(1) Separating records
R2R1 R3
CS 245 Notes 3 40
• Unspanned: records must be within one blockblock 1 block 2 ...
• Spannedblock 1 block 2 ...
(2) Spanned vs. Unspanned
R1 R2
R1
R3 R4 R5
R2 R3(a)
R3(b) R6R5R4 R7
(a)
CS 245 Notes 3 41
need indication need indication
of partial record of continuation
“pointer” to rest (+ from where?)
R1 R2 R3(a)
R3(b) R6R5R4 R7
(a)
With spanned records:
CS 245 Notes 3 42
• Unspanned is much simpler, but may waste space…
• Spanned essential if record size > block size
Spanned vs. unspanned:
CS 245 Notes 3 43
Example106 recordseach of size 2,050 bytes (fixed)block size = 4096 bytes
block 1 block 2
2050 bytes wasted 2046 2050 bytes wasted 2046
R1 R2
• Total wasted = 2 x 109 Utiliz = 50%• Total space = 4 x 109
CS 245 Notes 3 44
• Mixed - records of different types(e.g. EMPLOYEE, DEPT)allowed in same block
e.g., a block
(3) Mixed record types
EMP e1 DEPT d1 DEPT d2
CS 245 Notes 3 45
Why do we want to mix?Answer: CLUSTERING
Records that are frequently accessed together should bein the same block
CS 245 Notes 3 46
Compromise:
No mixing, but keep relatedrecords in same cylinder ...
CS 245 Notes 3 47
ExampleQ1: select A#, C_NAME, C_CITY, …
from DEPOSIT, CUSTOMERwhere DEPOSIT.C_NAME =
CUSTOMER.C.NAME
a blockCUSTOMER,NAME=SMITH
DEPOSIT,NAME=SMITHDEPOSIT,NAME=SMITH
CS 245 Notes 3 48
• If Q1 frequent, clustering good• But if Q2 frequent
Q2: SELECT * FROM CUSTOMER
CLUSTERING IS COUNTER PRODUCTIVE
CS 245 Notes 3 49
Fixed part in one block
Typically forhybrid format
Variable part in another block
(4) Split records
CS 245 Notes 3 50
Block with fixed recs.
R1 (a)R1 (b)
Block with variable recs.
R2 (a)
R2 (b)
R2 (c)
This block alsohas fixed recs.
CS 245 Notes 3 51
Question
What is difference between- Split records- Simply using two different
record types?
CS 245 Notes 3 52
• Ordering records in file (and block) by some key value
Sequential file ( sequenced)
(5) Sequencing
CS 245 Notes 3 53
Why sequencing?Typically to make it possible to
efficiently read records in order(e.g., to do a merge-join — discussed later)
CS 245 Notes 3 54
Sequencing Options
(a) Next record physically contiguous
...
(b) Linked
Next (R1)R1
R1 Next (R1)
CS 245 Notes 3 55
(c) Overflow area
Recordsin sequence
R1R2R3R4R5
Sequencing Options
headerR2.1R1.3R4.7
CS 245 Notes 3 56
• How does one refer to records?
(6) Indirection
Rx
Many options: Physical Indirect
CS 245 Notes 3 57
Purely Physical
Device IDE.g., Record Cylinder #
Address = Track #or ID Block #
Offset in block
Block ID
CS 245 Notes 3 58
Fully IndirectE.g., Record ID is arbitrary bit string
maprec ID r address
a
Physicaladdr.Rec ID
CS 245 Notes 3 59
Tradeoff
Flexibility Costto move records of indirection(for deletions, insertions)
CS 245 Notes 3 60
Physical Indirect
Many optionsin between …
CS 245 Notes 3 61
Ex #1 Indirection in block Header
A block: Free
space
R3R4R1 R2
CS 245 Notes 3 62
Block header - data at beginning that
describes blockMay contain:
- File ID (or RELATION or DB ID) - This block ID - Record directory
- Pointer to free space- Type of block (e.g. contains recs type 4; is overflow, …)- Pointer to other blocks “like it”- Timestamp ...
CS 245 Notes 3 63
Ex. #2 Use logical block #’s understood by file system
REC ID File ID Block # Record # or Offset
File ID, PhysicalBlock # Block ID
File Syst. Map
CS 245 Notes 3 64
File system map may be “Semi-physical”…
File F1: physical address of block 1table of bad blocks:B57 XXXB107 YYY
Rest can be computed via formula...
CS 245 Notes 3 65
Num. Blocks: 20Start Block: 1000Block Size: 100Bad Blocks:
3 20,0007 15,000
File DEFINITION
Where is Block # 2?Where is Block # 3?
CS 245 Notes 3 66
(1) Separating records(2) Spanned vs. Unspanned(3) Mixed record types - Clustering(4) Split records(5) Sequencing(6) Indirection
Options for storing records in blocks
CS 245 Notes 3 67
(1) Insertion/Deletion(2) Buffer Management(3) Comparison of Schemes
Other Topics
CS 245 Notes 3 68
Block
Deletion
Rx
CS 245 Notes 3 69
Options:(a) Immediately reclaim space(b) Mark deleted
– May need chain of deleted records(for re-use)
– Need a way to mark:• special characters• delete field• in map
CS 245 Notes 3 70
As usual, many tradeoffs...
• How expensive is to move valid record to free space for immediate reclaim?
• How much space is wasted?– e.g., deleted records, delete fields,
free space chains,...
CS 245 Notes 3 71
Dangling pointers
Concern with deletions
R1 ?
CS 245 Notes 3 72
Solution #1: Do not worry
CS 245 Notes 3 73
E.g., Leave “MARK” in map or old location
Solution #2: Tombstones
• Physical IDs A block
This space This space cannever re-used be re-used
CS 245 Notes 3 74
• Logical IDs
ID LOC
7788
map
Never reuseID 7788 nor
space in map...
E.g., Leave “MARK” in map or old location
Solution #2: Tombstones
CS 245 Notes 3 75
• Place record ID within every record• When you follow a pointer, check if
it leads to correct record
Solution #3 (?):
Does this work??? If space reused, won’t new record have same ID?
to3-77
rec-id:3-77
CS 245 Notes 3 76
• To point, use (pointer + hash)or (pointer + key)?
Solution #4 (?):
• What if record modified???
ptr+hash key
CS 245 Notes 3 77
Easy case: records not in sequence Insert new record at end of
file or in deleted slot If records are variable size,
not as easy...
Insert
CS 245 Notes 3 78
Hard case: records in sequence If free space “close by”, not too bad... Or use overflow idea...
Insert
CS 245 Notes 3 79
Interesting problems:• How much free space to leave in each
block, track, cylinder?• How often do I reorganize file +
overflow?
CS 245 Notes 3 80
Freespace
CS 245 Notes 3 81
• DB features needed• Why LRU may be bad Read• Pinned blocks Textbook!• Forced output• Double buffering• Swizzling
Buffer Management
in Notes02
CS 245 Notes 3 82
Swizzling
Memory Disk
Rec A
block 1
Rec Ablock 2 block 2
block 1
CS 245 Notes 3 83
Translation DB Addr Mem Addr Table Rec-A Rec-A-inMem
One Option:
CS 245 Notes 3 84
In memory pointers - need “type” bit
to disk
to memoryM
Another Option:
CS 245 Notes 3 85
Swizzling• Automatic• On-demand• No swizzling / program control
CS 245 Notes 3 86
• There are 10,000,000 ways to organize my data on disk…
Which is right for me?
Comparison
CS 245 Notes 3 87
Issues:
Flexibility Space Utilization
Complexity Performance
CS 245 Notes 3 88
To evaluate a given strategy, compute following parameters:-> space used for expected data-> expected time to- fetch record given key- fetch record with next key- insert record- append record- delete record- update record- read all file- reorganize file
CS 245 Notes 3 89
ExampleHow would you design Megatron 3000
storage system? (for a relational DB, low end)– Variable length records?– Spanned?– What data types?– Fixed format?– Record IDs ?– Sequencing?– How to handle deletions?
CS 245 Notes 3 90
• How to lay out data on diskData Items Records Blocks Files Memory DBMS
Summary
CS 245 Notes 3 91
How to find a record quickly,given a key
Next