external memory i - technical university of...

22
Philip Bille External Memory I Computational Models Scanning Sorting Searching

Upload: others

Post on 24-Jan-2021

0 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: External Memory I - Technical University of Denmarkcourses.compute.dtu.dk/02289/2020/externalmemoryI/... · 2020. 8. 28. · 33 4 25 28 45 18 7 12 36 1 47 42 50 16 31 1 4 7 12 16

Philip Bille

External Memory I

• Computational Models• Scanning• Sorting• Searching

Page 2: External Memory I - Technical University of Denmarkcourses.compute.dtu.dk/02289/2020/externalmemoryI/... · 2020. 8. 28. · 33 4 25 28 45 18 7 12 36 1 47 42 50 16 31 1 4 7 12 16

External Memory I

• Computational Models• Scanning• Sorting• Searching

Page 3: External Memory I - Technical University of Denmarkcourses.compute.dtu.dk/02289/2020/externalmemoryI/... · 2020. 8. 28. · 33 4 25 28 45 18 7 12 36 1 47 42 50 16 31 1 4 7 12 16

Computational Models

• (word) RAM Model• Infinite memory of w-bit memory cells• Instructions: Memory access, arithmetic operations, boolean operations, control-

flow operations, etc. • Complexity model.

• Time = number of instructions. • Space = number of memory cells used.

CPU Main memory

Page 4: External Memory I - Technical University of Denmarkcourses.compute.dtu.dk/02289/2020/externalmemoryI/... · 2020. 8. 28. · 33 4 25 28 45 18 7 12 36 1 47 42 50 16 31 1 4 7 12 16

• iMac (late 2017)• CPU: 3.5 Ghz Core i5 (4 cores)• Registers: ?• L1 cache: ?• L2 cache: 256k per core• L3 cache: 6 MB shared• Memory: 8 GB• Disk: 1 Tb, (32 Gb SSD + 1Tb hard drive)• Instructions: Memory access, arithmetic operations, boolean operations, control-

flow operations, etc. • Complexity?

Computational Models

CPU

Level-1 cache

Level-2Cache DiskRegisters Level-3

Cache Main

memory Cloud

Page 5: External Memory I - Technical University of Denmarkcourses.compute.dtu.dk/02289/2020/externalmemoryI/... · 2020. 8. 28. · 33 4 25 28 45 18 7 12 36 1 47 42 50 16 31 1 4 7 12 16

Computational Models

• I/O model [Aggarwal and Vitter 1988].• Limited memory, Infinite disk• Instructions: Disk I/O operations, memory access, arithmetic operations, boolean

operations, control-flow operations, etc. • Complexity model.

• I/Os = Number of disk I/Os• Computation is free (!)

CPURAM Disk

I/ON = problem sizeM = memory sizeB = block size

Page 6: External Memory I - Technical University of Denmarkcourses.compute.dtu.dk/02289/2020/externalmemoryI/... · 2020. 8. 28. · 33 4 25 28 45 18 7 12 36 1 47 42 50 16 31 1 4 7 12 16

External Memory I

• Computational Models• Scanning• Sorting• Searching

Page 7: External Memory I - Technical University of Denmarkcourses.compute.dtu.dk/02289/2020/externalmemoryI/... · 2020. 8. 28. · 33 4 25 28 45 18 7 12 36 1 47 42 50 16 31 1 4 7 12 16

Scanning

33 4 25 28 45 18 7 12 36 1 47 42 50 16 ...

• Scanning. Given an array A of N values stored in N/B blocks and a key x, determine if x is in A.

• I/Os. O(N/B).

Page 8: External Memory I - Technical University of Denmarkcourses.compute.dtu.dk/02289/2020/externalmemoryI/... · 2020. 8. 28. · 33 4 25 28 45 18 7 12 36 1 47 42 50 16 31 1 4 7 12 16

External Memory I

• Computational Models• Scanning• Sorting• Searching

Page 9: External Memory I - Technical University of Denmarkcourses.compute.dtu.dk/02289/2020/externalmemoryI/... · 2020. 8. 28. · 33 4 25 28 45 18 7 12 36 1 47 42 50 16 31 1 4 7 12 16

• Sorting. Given array A of N values (stored in N/B consecutive blocks), output the values in increasing order.

Sorting

33 4 25 28 45 18 7 12 36 1 47 42 50 16 31

1 4 7 12 16 18 25 28 31 33 36 42 45 47 50

Page 10: External Memory I - Technical University of Denmarkcourses.compute.dtu.dk/02289/2020/externalmemoryI/... · 2020. 8. 28. · 33 4 25 28 45 18 7 12 36 1 47 42 50 16 31 1 4 7 12 16

• Goal. Sorting in O(N/B logM/B (N/B)) I/Os.• Solution in 3 steps.

• Base case. • External multi-way merge.• External merge sort.

External Merge Sort

Page 11: External Memory I - Technical University of Denmarkcourses.compute.dtu.dk/02289/2020/externalmemoryI/... · 2020. 8. 28. · 33 4 25 28 45 18 7 12 36 1 47 42 50 16 31 1 4 7 12 16

External Merge Sort

• Base case. • Partition N elements into N/M arrays of size M. • Load each into memory and sort.

• I/Os. O(N/B)

N

M

Page 12: External Memory I - Technical University of Denmarkcourses.compute.dtu.dk/02289/2020/externalmemoryI/... · 2020. 8. 28. · 33 4 25 28 45 18 7 12 36 1 47 42 50 16 31 1 4 7 12 16

External Merge Sort

MB

B

• Multiway merge algorithm.• N elements in M/B arrays.• Load M/B first blocks into memory and sort. • Output B smallest elements. • Load more blocks into memory if needed.• Repeat

• I/Os. O(N/B).

Page 13: External Memory I - Technical University of Denmarkcourses.compute.dtu.dk/02289/2020/externalmemoryI/... · 2020. 8. 28. · 33 4 25 28 45 18 7 12 36 1 47 42 50 16 31 1 4 7 12 16

External Merge Sort

• Algorithm.• Partition N elements into N/M arrays of size M. Load each into memory and sort.• Apply M/B way external multiway merge until left with single sorted array.

• I/Os. • Sort N/M arrays: O(N/B) I/Os• Height of tree O(logM/B(N/M))• Cost per level: O(N/B) I/Os.

N

M

M/B

M/B

M/B M/B

Total I/Os: O ( NB

logM/BNM ) = O ( N

BlogM/B

NB )

Page 14: External Memory I - Technical University of Denmarkcourses.compute.dtu.dk/02289/2020/externalmemoryI/... · 2020. 8. 28. · 33 4 25 28 45 18 7 12 36 1 47 42 50 16 31 1 4 7 12 16

External Memory I

• Computational Models• Scanning• Sorting• Searching

Page 15: External Memory I - Technical University of Denmarkcourses.compute.dtu.dk/02289/2020/externalmemoryI/... · 2020. 8. 28. · 33 4 25 28 45 18 7 12 36 1 47 42 50 16 31 1 4 7 12 16

• Searching. Maintain a set S ⊆ U = {0, ..., u-1} supporting• member(x): determine if x ∈ S• predecessor(x): return largest element in S ≤ x.• successor(x): return smallest element in S ≥ x.• insert(x): set S = S ∪ {x} • delete(x): set S = S - {x}

Searching

0 u-1

xpredecessor(x) successor(x)

Page 16: External Memory I - Technical University of Denmarkcourses.compute.dtu.dk/02289/2020/externalmemoryI/... · 2020. 8. 28. · 33 4 25 28 45 18 7 12 36 1 47 42 50 16 31 1 4 7 12 16

• Applications.• Relational data bases. • File systems.

Searching

Page 17: External Memory I - Technical University of Denmarkcourses.compute.dtu.dk/02289/2020/externalmemoryI/... · 2020. 8. 28. · 33 4 25 28 45 18 7 12 36 1 47 42 50 16 31 1 4 7 12 16

B-tree

• B-tree of order δ = ϴ(B) with N keys.• Keys in leaves. Routing elements in internal nodes. • Degree between δ/2 and δ.• Root degree between 2 and δ.• Leaves store between δ/2 and δ keys. • All leaves have the same depth.

• Height. ϴ(logδ (N/B)) = ϴ(logB N)

10 42 70

..10 11..42 43..70 71..

δ = 4 δ = 4

Page 18: External Memory I - Technical University of Denmarkcourses.compute.dtu.dk/02289/2020/externalmemoryI/... · 2020. 8. 28. · 33 4 25 28 45 18 7 12 36 1 47 42 50 16 31 1 4 7 12 16

B-tree

• Searching.• Find leaf using routing elements.

• I/Os. O(logB N).

δ = 4

Page 19: External Memory I - Technical University of Denmarkcourses.compute.dtu.dk/02289/2020/externalmemoryI/... · 2020. 8. 28. · 33 4 25 28 45 18 7 12 36 1 47 42 50 16 31 1 4 7 12 16

B-tree

• Insertion.• Find leaf. • Insert key. • Split nodes on path.

• I/Os. O(logB N).

δ = 4

split

Page 20: External Memory I - Technical University of Denmarkcourses.compute.dtu.dk/02289/2020/externalmemoryI/... · 2020. 8. 28. · 33 4 25 28 45 18 7 12 36 1 47 42 50 16 31 1 4 7 12 16

B-tree

• Deletion.• Find leaf. • Delete key. • Share or fuse nodes on path.

• I/Os. O(logB N).

δ = 4

share

fuse

Page 21: External Memory I - Technical University of Denmarkcourses.compute.dtu.dk/02289/2020/externalmemoryI/... · 2020. 8. 28. · 33 4 25 28 45 18 7 12 36 1 47 42 50 16 31 1 4 7 12 16

Basic Bounds

Internal External

Scanning O(N) scan(N) = O(N/B)

Sorting O(Nlog N) sort(N) = O((N/B)logM/B (N/B))

Searching O(log N) search(N) = O(logB(N))

Page 22: External Memory I - Technical University of Denmarkcourses.compute.dtu.dk/02289/2020/externalmemoryI/... · 2020. 8. 28. · 33 4 25 28 45 18 7 12 36 1 47 42 50 16 31 1 4 7 12 16

External Memory I

• Computational Models• Scanning• Sorting• Searching