
  • DATABASE SYSTEM IMPLEMENTATION

    GT 4420/6422 // SPRING 2019 // @JOY_ARULRAJ

    LECTURE #23: VECTORIZED EXECUTION

  • ANATOMY OF A DATABASE SYSTEM

    Process Manager
    → Connection Manager + Admission Control

    Query Processor
    → Query Parser
    → Query Optimizer
    → Query Executor

    Transactional Storage Manager
    → Lock Manager (Concurrency Control)
    → Access Methods (or Indexes)
    → Buffer Pool Manager
    → Log Manager

    Shared Utilities
    → Memory Manager + Disk Manager
    → Networking Manager

    2

    Source: Anatomy of a Database System
    http://cs.brown.edu/courses/cs295-11/2006/anatomyofadatabase.pdf

  • TODAY’S AGENDA

    Background
    Hardware
    Vectorized Algorithms (Columbia)

    3

  • VECTORIZATION

    The process of converting an algorithm's scalar implementation, which processes a single pair of operands at a time, into a vector implementation, which performs one operation on multiple pairs of operands at once.

    4

  • WHY THIS MATTERS

    Say we can parallelize our algorithm across 32 cores, and each core has 4-wide SIMD registers.

    Potential Speed-up: 32x × 4x = 128x

    5

  • MULTI-CORE CPUS

    Use a small number of high-powered cores.
    → Intel Xeon Skylake / Kaby Lake
    → High power consumption and area per core.

    Massively superscalar and aggressive out-of-order execution
    → Instructions are issued from a sequential stream.
    → Check for dependencies between instructions.
    → Process multiple instructions per clock cycle.

    6

  • MANY INTEGRATED CORES (MIC)

    Use a larger number of low-powered cores.
    → Intel Xeon Phi
    → Low power consumption and area per core.
    → Expanded SIMD instructions with larger register sizes.

    Knights Ferry (Columbia Paper)
    → Non-superscalar and in-order execution
    → Cores = Intel P54C (aka Pentium from the 1990s).

    Knights Landing (Since 2016)
    → Superscalar and out-of-order execution.
    → Cores = Silvermont (aka Atom)

    7

    http://www.intel.com/content/www/us/en/processors/xeon/xeon-phi-detail.html
    https://en.wikipedia.org/wiki/P5_(microarchitecture)
    https://en.wikipedia.org/wiki/Xeon_Phi


  • SINGLE INSTRUCTION, MULTIPLE DATA

    A class of CPU instructions that allow the processor to perform the same operation on multiple data points simultaneously.

    All major ISAs have microarchitecture support for SIMD operations.
    → x86: MMX, SSE, SSE2, SSE3, SSE4, AVX, AVX2, AVX-512
    → PowerPC: Altivec
    → ARM: NEON

    11

  • SIMD EXAMPLE

    12

    X + Y = Z:

    $\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} + \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix} = \begin{bmatrix} x_1 + y_1 \\ x_2 + y_2 \\ \vdots \\ x_n + y_n \end{bmatrix}$

  • SIMD EXAMPLE

    13–22

    X + Y = Z over eight-element arrays:

    for (i = 0; i < n; i++) {
      Z[i] = X[i] + Y[i];
    }

    SISD: the loop performs one addition per iteration, stepping through X, Y, and Z element by element.

    SIMD: the same loop processes four elements per instruction by packing them into 4-wide vector registers.

  • STREAMING SIMD EXTENSIONS (SSE)

    SSE is a collection of SIMD instructions that target special 128-bit SIMD registers. These registers can be packed with four 32-bit scalars, after which an operation can be performed on each of the four elements simultaneously.

    First introduced by Intel in 1999.

    23

  • SIMD INSTRUCTIONS (1)

    Data Movement
    → Moving data in and out of vector registers

    Arithmetic Operations
    → Apply operation on multiple data items (e.g., 2 doubles, 4 floats, 16 bytes)
    → Example: ADD, SUB, MUL, DIV, SQRT, MAX, MIN

    Logical Instructions
    → Logical operations on multiple data items
    → Example: AND, OR, XOR, ANDN, ANDPS, ANDNPS

    24

  • SIMD INSTRUCTIONS (2)

    Comparison Instructions
    → Comparing multiple data items (==, <, <=, >, >=, !=)

    Shuffle Instructions
    → Move data in between SIMD registers

    Miscellaneous
    → Conversion: Transform data between x86 and SIMD registers.
    → Cache Control: Move data directly from SIMD registers to memory (bypassing the CPU cache).

    25

  • INTEL SIMD EXTENSIONS

    26

    Year  Extension  Width
    1997  MMX        64 bits
    1999  SSE        128 bits  (×4 single-precision)
    2001  SSE2       128 bits  (×2 double-precision)
    2004  SSE3       128 bits
    2006  SSSE3      128 bits
    2006  SSE4.1     128 bits
    2008  SSE4.2     128 bits
    2011  AVX        256 bits  (×8 single, ×4 double)
    2013  AVX2       256 bits
    2017  AVX-512    512 bits  (×16 single, ×8 double)

    Source: James Reinders
    https://www.youtube.com/watch?v=_OJmxi4-twY

  • WHY NOT GPUS?

    Moving data back and forth between DRAM and the GPU over the PCI-E bus is slow.

    There are some newer GPU-enabled DBMSs
    → Examples: MapD, SQream, Kinetica

    Emerging co-processors that can share the CPU’s memory may change this.
    → Examples: AMD’s APU, Intel’s Knights Landing

    27

    https://www.mapd.com/
    http://sqream.com/
    https://www.kinetica.com/

  • VECTORIZATION

    Choice #1: Automatic Vectorization

    Choice #2: Compiler Hints

    Choice #3: Explicit Vectorization

    Moving down this list trades ease of use for programmer control.

    28

    Source: James Reinders
    https://www.youtube.com/watch?v=_OJmxi4-twY


  • AUTOMATIC VECTORIZATION

    The compiler can identify when instructions inside of a loop can be rewritten as a vectorized operation.

    Works for simple loops only and is rare in database operators. Requires hardware support for SIMD instructions.

    30

  • AUTOMATIC VECTORIZATION

    31

    void add(int *X, int *Y, int *Z) {
      for (int i = 0; i < MAX; i++) {
        Z[i] = X[i] + Y[i];
      }
    }

  • AUTOMATIC VECTORIZATION

    32–35

    This loop is not legal to automatically vectorize: the pointers X, Y, and Z might point to the same (or overlapping) addresses, so the compiler must assume the additions are done sequentially, exactly as written.

    These might point to the same address!

    void add(int *X, int *Y, int *Z) {
      for (int i = 0; i < MAX; i++) {
        Z[i] = X[i] + Y[i];
      }
    }
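
    To see why aliasing blocks vectorization, here is a small self-contained illustration (my own example, not from the slides): when Z overlaps X shifted by one element, the scalar loop and a 4-wide SIMD add produce different answers.

    #include <stdio.h>

    int main(void) {
      int buf[5] = {1, 1, 1, 1, 1};
      int *X = buf, *Y = buf, *Z = buf + 1;   /* Z aliases X shifted by 1 */
      for (int i = 0; i < 4; i++) {
        Z[i] = X[i] + Y[i];   /* each add sees the previous write */
      }
      /* Scalar order prints 1 2 4 8 16. A 4-wide SIMD add would read
         X[0..3] = 1,1,1,1 before any write and produce 1 2 2 2 2. */
      for (int i = 0; i < 5; i++) printf("%d ", buf[i]);
      printf("\n");
      return 0;
    }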

  • COMPILER HINTS

    Provide the compiler with additional information about the code to let it know that it is safe to vectorize.

    Two approaches:
    → Give explicit information about memory locations.
    → Tell the compiler to ignore vector dependencies.

    36

  • COMPILER HINTS

    The restrict keyword (standard in C99; available in C++ compilers as the __restrict extension) tells the compiler that the arrays are distinct locations in memory.

    37

    void add(int *restrict X,
             int *restrict Y,
             int *restrict Z) {
      for (int i = 0; i < MAX; i++) {
        Z[i] = X[i] + Y[i];
      }
    }

  • COMPILER HINTS

    This pragma tells the compiler to ignore loop dependencies for the vectors.

    It’s up to you to make sure that this is correct.

    38

    void add(int *X, int *Y, int *Z) {
      #pragma ivdep
      for (int i = 0; i < MAX; i++) {
        Z[i] = X[i] + Y[i];
      }
    }
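
    #pragma ivdep is the Intel compiler's spelling. As a hedged sketch, two widely available alternatives (assuming GCC for the first, and an OpenMP-4.0-capable compiler for the second) look like this:

    #define MAX 1024

    /* GCC dialect of ivdep. */
    void add_gcc(int *X, int *Y, int *Z) {
      #pragma GCC ivdep
      for (int i = 0; i < MAX; i++)
        Z[i] = X[i] + Y[i];
    }

    /* OpenMP SIMD directive (compile with -fopenmp or -fopenmp-simd). */
    void add_omp(int *X, int *Y, int *Z) {
      #pragma omp simd
      for (int i = 0; i < MAX; i++)
        Z[i] = X[i] + Y[i];
    }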

  • EXPLICIT VECTORIZATION

    Use CPU intrinsics to manually marshal data between SIMD registers and execute vectorized instructions.

    Potentially not portable.

    39

  • EXPLICIT VECTORIZATION

    Store the vectors in 128-bit SIMD registers.

    Then invoke the intrinsic to add together the vectors and write them to the output location.

    40

    void add(int *X, int *Y, int *Z) {
      __m128i *vecX = (__m128i*)X;
      __m128i *vecY = (__m128i*)Y;
      __m128i *vecZ = (__m128i*)Z;
      for (int i = 0; i < MAX/4; i++) {
        _mm_store_si128(vecZ++, _mm_add_epi32(*vecX++, *vecY++));
      }
    }
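
    A self-contained, runnable version of this sketch (my own scaffolding around the slide's function; assumes SSE2 and 16-byte-aligned arrays whose length is a multiple of 4):

    #include <emmintrin.h>   /* SSE2 intrinsics */
    #include <stdio.h>

    #define MAX 8

    void add(int *X, int *Y, int *Z) {
      __m128i *vecX = (__m128i*)X;
      __m128i *vecY = (__m128i*)Y;
      __m128i *vecZ = (__m128i*)Z;
      for (int i = 0; i < MAX/4; i++) {
        /* _mm_store_si128 requires 16-byte alignment; use
           _mm_loadu_si128/_mm_storeu_si128 for unaligned data. */
        _mm_store_si128(vecZ++, _mm_add_epi32(*vecX++, *vecY++));
      }
    }

    int main(void) {
      _Alignas(16) int X[MAX] = {1, 2, 3, 4, 5, 6, 7, 8};
      _Alignas(16) int Y[MAX] = {8, 7, 6, 5, 4, 3, 2, 1};
      _Alignas(16) int Z[MAX];
      add(X, Y, Z);
      for (int i = 0; i < MAX; i++) printf("%d ", Z[i]);  /* prints 9 eight times */
      printf("\n");
      return 0;
    }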

  • VECTORIZATION DIRECTION

    Approach #1: Horizontal
    → Perform operation on all elements together within a single vector.

    Approach #2: Vertical
    → Perform operation in an elementwise manner on elements of each vector.

    41

    Source: Przemysław Karpiński
    https://gain-performance.com/2017/05/01/umesimd-tutorial-2-calculation/

  • VECTORIZATION DIRECTION

    42–43

    [Figure: a horizontal SIMD add reduces the single vector (0, 1, 2, 3) to its sum, 6; a vertical SIMD add combines (0, 1, 2, 3) and (1, 1, 1, 1) elementwise into (1, 2, 3, 4).]

    Source: Przemysław Karpiński
    https://gain-performance.com/2017/05/01/umesimd-tutorial-2-calculation/
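
    The same two directions expressed with x86 intrinsics (an illustrative sketch, assuming SSSE3 for _mm_hadd_epi32; the lane values match the figure above):

    #include <immintrin.h>
    #include <stdio.h>

    int main(void) {
      __m128i v = _mm_setr_epi32(0, 1, 2, 3);
      __m128i w = _mm_setr_epi32(1, 1, 1, 1);

      /* Vertical: elementwise add of two vectors -> (1, 2, 3, 4). */
      __m128i vert = _mm_add_epi32(v, w);

      /* Horizontal: reduce one vector to its sum -> 6. */
      __m128i h = _mm_hadd_epi32(v, v);  /* (0+1, 2+3, 0+1, 2+3) */
      h = _mm_hadd_epi32(h, h);          /* (6, 6, 6, 6)         */
      int sum = _mm_cvtsi128_si32(h);

      int out[4];
      _mm_storeu_si128((__m128i*)out, vert);
      printf("vertical = %d %d %d %d, horizontal sum = %d\n",
             out[0], out[1], out[2], out[3], sum);
      return 0;
    }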

  • EXPLICIT VECTORIZATION

    Linear Access Operators
    → Predicate evaluation
    → Compression

    Ad-hoc Vectorization
    → Sorting
    → Merging

    Composable Operations
    → Multi-way trees
    → Bucketized hash tables

    44

    Source: Orestis Polychroniou
    http://www.cs.columbia.edu/~orestis

  • VECTORIZED DBMS ALGORITHMS

    Principles for efficient vectorization by using fundamental vector operations to construct more advanced functionality.
    → Favor vertical vectorization by processing different input data per lane.
    → Maximize lane utilization by executing different things per lane subset.

    45

    RETHINKING SIMD VECTORIZATION FOR IN-MEMORY DATABASES
    SIGMOD 2015

  • FUNDAMENTAL OPERATIONS

    Selective Load
    Selective Store
    Selective Gather
    Selective Scatter

    46

  • FUNDAMENTAL VECTOR OPERATIONS

    47–54

    Selective Load

    Vector: A B C D    Mask: 0 1 0 1    Memory: U V W X Y Z …

    The next contiguous values from memory (U, then V) are loaded only into the lanes where the mask is set, yielding A U C V.

  • FUNDAMENTAL VECTOR OPERATIONS

    55–60

    Selective Store

    Vector: A B C D    Mask: 0 1 0 1    Memory: U V W X Y Z …

    The lanes where the mask is set (B, then D) are written to the next contiguous memory locations.

  • FUNDAMENTAL VECTOR OPERATIONS

    61–66

    Selective Gather

    Index Vector: 2 1 5 3    Memory (offsets 0–5): U V W X Y Z …

    Each lane loads the memory value at its index, so the value vector becomes W V Z X.

  • FUNDAMENTAL VECTOR OPERATIONS

    67–71

    Selective Scatter

    Value Vector: A B C D    Index Vector: 2 1 5 3    Memory (offsets 0–5): U V W X Y Z …

    Each lane writes its value to the memory location at its index, so memory becomes U B A D Y C.
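
    On modern x86 these four operations map directly onto AVX-512 intrinsics. A hedged sketch (assumes an AVX-512F-capable CPU and 16 lanes of 32-bit integers; compile with -mavx512f):

    #include <immintrin.h>

    void fundamental_ops(int *mem, const int *indexes) {
      __m512i vec    = _mm512_loadu_si512(mem);
      __mmask16 mask = 0xAAAA;  /* every other lane set */
      __m512i vidx   = _mm512_loadu_si512(indexes);

      /* Selective load: pack the next contiguous memory values
         into the lanes where the mask is set. */
      vec = _mm512_mask_expandloadu_epi32(vec, mask, mem);

      /* Selective store: write the masked lanes contiguously. */
      _mm512_mask_compressstoreu_epi32(mem, mask, vec);

      /* Gather: each lane loads mem[indexes[lane]] (scale = 4 bytes). */
      __m512i gathered = _mm512_i32gather_epi32(vidx, mem, 4);

      /* Scatter: each lane stores its value to mem[indexes[lane]]. */
      _mm512_i32scatter_epi32(mem, vidx, gathered, 4);
    }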

  • ISSUES

    Gathers and scatters are not really executed in parallel because the L1 cache only allows one or two distinct accesses per cycle.

    Gathers are only supported in newer CPUs.

    Selective loads and stores are also emulated in Xeon CPUs using vector permutations.

    72

    https://software.intel.com/en-us/node/683481

  • VECTORIZED OPERATORS

    Selection Scans
    Hash Tables
    Partitioning

    Paper provides additional info:
    → Joins, Sorting, Bloom filters.

    73

    RETHINKING SIMD VECTORIZATION FOR IN-MEMORY DATABASES
    SIGMOD 2015

  • SELECTION SCANS

    74

    SELECT * FROM table
    WHERE key >= $(low) AND key <= $(high)

  • SELECTION SCANS

    75–79

    Scalar (Branching)

    i = 0
    for t in table:
      key = t.key
      if (key≥low) && (key≤high):
        copy(t, output[i])
        i = i + 1

    Scalar (Branchless)

    i = 0
    for t in table:
      copy(t, output[i])
      key = t.key
      m = (key≥low ? 1 : 0) && (key≤high ? 1 : 0)
      i = i + m

    Source: Bogdan Raducanu
    http://15721.courses.cs.cmu.edu/spring2017/papers/20-compilation/p1231-raducanu.pdf
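
    As a concrete illustration of the branchless variant (my own C sketch, not from the slides; the table is modeled as a plain array of 32-bit keys and the output stores matching offsets):

    #include <stddef.h>
    #include <stdint.h>

    size_t scan_branchless(const int32_t *keys, size_t n,
                           int32_t low, int32_t high, uint32_t *out) {
      size_t i = 0;
      for (size_t t = 0; t < n; t++) {
        out[i] = (uint32_t)t;                    /* write unconditionally */
        int m = (keys[t] >= low) & (keys[t] <= high);
        i += (size_t)m;                          /* advance only on match */
      }
      return i;  /* number of matches */
    }

    The unconditional store plus predicated increment removes the data-dependent branch, so throughput stays flat across selectivities instead of collapsing when the branch becomes unpredictable.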

  • SELECTION SCANS

    80–91

    Vectorized

    i = 0
    for vt in table:
      simdLoad(vt.key, vk)
      vm = (vk≥low ? 1 : 0) && (vk≤high ? 1 : 0)
      simdStore(vt, vm, output[i])
      i = i + |vm≠false|

    Example:

    SELECT * FROM table
    WHERE key >= "O" AND key <= "U"

    Table (ID, KEY): (1, J) (2, O) (3, Y) (4, S) (5, U) (6, X)

    Key Vector: J O Y S U X
    SIMD Compare → Mask: 0 1 0 1 1 0
    All Offsets: 0 1 2 3 4 5
    SIMD Store (masked) → Matched Offsets: 1 3 4
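
    A hedged AVX-512 sketch of this loop over 32-bit integer keys (assumes AVX-512F, n divisible by 16, and a GCC/Clang-style __builtin_popcount; out receives the offsets of matching keys):

    #include <immintrin.h>
    #include <stddef.h>
    #include <stdint.h>

    size_t scan_simd(const int32_t *keys, size_t n,
                     int32_t low, int32_t high, int32_t *out) {
      const __m512i vlow  = _mm512_set1_epi32(low);
      const __m512i vhigh = _mm512_set1_epi32(high);
      const __m512i step  = _mm512_set1_epi32(16);
      __m512i offsets = _mm512_setr_epi32(0, 1, 2, 3, 4, 5, 6, 7,
                                          8, 9, 10, 11, 12, 13, 14, 15);
      size_t i = 0;
      for (size_t t = 0; t < n; t += 16) {
        __m512i vk = _mm512_loadu_si512(keys + t);
        /* vm = (vk >= low) && (vk <= high), one bit per lane. */
        __mmask16 vm = _mm512_cmpge_epi32_mask(vk, vlow)
                     & _mm512_cmple_epi32_mask(vk, vhigh);
        /* Selective store: pack matching offsets into the output. */
        _mm512_mask_compressstoreu_epi32(out + i, vm, offsets);
        i += (size_t)__builtin_popcount((unsigned)vm);
        offsets = _mm512_add_epi32(offsets, step);
      }
      return i;
    }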

  • SELECTION SCANS

    92–97

    [Figure: throughput (billion tuples/sec) vs. selectivity (%) for Scalar (Branching), Scalar (Branchless), Vectorized (Early Mat), and Vectorized (Late Mat), on two platforms: MIC (Xeon Phi 7120P – 61 Cores + 4×HT) and Multi-Core (Xeon E3-1275v3 – 4 Cores + 2×HT). Data labels range up to roughly 5.7 billion tuples/sec; throughput falls as selectivity grows, and all variants converge once the scan hits the memory-bandwidth ceiling ("Memory Bandwidth" annotations).]

  • HASH TABLES – PROBING

    98–102

    Scalar

    Linear Probing Hash Table (KEY, PAYLOAD)

    Probe with input key k1: compute h1 = hash(k1), then compare k1 against the key in each slot (k9, then k3, then k8, …), advancing one slot at a time until k1 is found or an empty slot is reached.

  • HASH TABLES – PROBING

    103–106

    Vectorized (Horizontal)

    Linear Probing Bucketized Hash Table (KEYS, PAYLOAD): each bucket holds several keys.

    Probe with input key k1: compute h1 = hash(k1), then SIMD Compare k1 against all keys in the bucket (k9 k3 k8 k1) at once → Matched Mask 0 0 0 1.

  • HASH TABLES – PROBING

    107–114

    Vectorized (Vertical)

    Linear Probing Hash Table (KEY, PAYLOAD)

    Probe with a whole vector of input keys (k1 k2 k3 k4):
    → Compute the Hash Index Vector (h1 h2 h3 h4) with hash(key).
    → SIMD Gather the table keys at those slots (k1 k99 k88 k4).
    → SIMD Compare against the input keys → mask 1 0 0 1.
    → Matched lanes (k1, k4) emit their results and are refilled with fresh input keys (k5, k6); unmatched lanes advance to the next slot (h2+1, h3+1).
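
    A simplified AVX-512 sketch of vertical probing (my own illustration under strong assumptions: AVX-512F, a power-of-two table of 32-bit keys, a toy multiplicative hash, every probe key present, and no lane refill; finished lanes simply go idle instead of taking new keys):

    #include <immintrin.h>
    #include <stdint.h>

    /* match_slot: 16 entries, lane-indexed; receives the slot where
       each probe key was found. */
    void probe_vertical(const int32_t *ht_keys, uint32_t ht_mask,
                        const int32_t *probe, int32_t *match_slot) {
      __m512i keys = _mm512_loadu_si512(probe);      /* 16 probe keys */
      /* Toy multiplicative hash per lane, masked into the table. */
      __m512i idx = _mm512_and_si512(
          _mm512_mullo_epi32(keys, _mm512_set1_epi32((int32_t)0x9E3779B1)),
          _mm512_set1_epi32((int32_t)ht_mask));
      __mmask16 active = 0xFFFF;
      while (active) {
        /* Gather the table key at each active lane's current slot. */
        __m512i cur = _mm512_mask_i32gather_epi32(
            _mm512_setzero_si512(), active, idx, ht_keys, 4);
        /* Lanes whose gathered key equals their probe key are done. */
        __mmask16 hit = _mm512_mask_cmpeq_epi32_mask(active, cur, keys);
        _mm512_mask_storeu_epi32(match_slot, hit, idx);
        active = (__mmask16)(active & ~hit);
        /* Remaining lanes advance to the next slot (linear probing). */
        idx = _mm512_and_si512(
            _mm512_add_epi32(idx, _mm512_set1_epi32(1)),
            _mm512_set1_epi32((int32_t)ht_mask));
      }
    }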

  • HASH TABLES – PROBING

    115–119

    [Figure: throughput (billion tuples/sec) vs. hash table size (4KB–64MB) for Scalar, Vectorized (Horizontal), and Vectorized (Vertical), on MIC (Xeon Phi 7120P – 61 Cores + 4×HT) and Multi-Core (Xeon E3-1275v3 – 4 Cores + 2×HT). Data labels range up to roughly 2.4 billion tuples/sec; the vectorized advantage shrinks once the table no longer fits in cache ("Out of Cache" annotations).]

  • PARTITIONING – HISTOGRAM

    120–126

    Use scatters and gathers to increment counts. Replicate the histogram to handle collisions.

    Input Key Vector (k1 k2 k3 k4) → SIMD Radix → Hash Index Vector (h1 h2 h3 h4).

    A plain SIMD scatter of +1 into a single histogram loses updates when two lanes hit the same bucket. Instead, replicate the histogram once per vector lane (# of Vector Lanes copies): each lane scatters its +1 into its own copy, so lanes never collide, and a final SIMD Add folds the copies into the actual histogram.
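
    A hedged AVX-512 sketch of the replicated update (assumes AVX-512F; the hypothetical layout hist[bucket * 16 + lane] keeps one copy per lane so lanes in the same vector never collide):

    #include <immintrin.h>
    #include <stdint.h>

    enum { LANES = 16 };

    /* hist has NBUCKETS * LANES entries: one histogram copy per lane. */
    void histogram_update(int32_t *hist, __m512i hash_idx) {
      const __m512i lane_id = _mm512_setr_epi32(0, 1, 2, 3, 4, 5, 6, 7,
                                                8, 9, 10, 11, 12, 13, 14, 15);
      /* Per-lane slot: bucket * LANES + lane. */
      __m512i slot = _mm512_add_epi32(
          _mm512_mullo_epi32(hash_idx, _mm512_set1_epi32(LANES)),
          lane_id);
      /* Gather current counts, add 1, scatter back. Distinct slots per
         lane make the read-modify-write safe within one vector. */
      __m512i cnt = _mm512_i32gather_epi32(slot, hist, 4);
      cnt = _mm512_add_epi32(cnt, _mm512_set1_epi32(1));
      _mm512_i32scatter_epi32(hist, slot, cnt, 4);
    }
    /* Afterwards: final[b] = sum over lanes of hist[b * LANES + lane]. */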

  • JOINS

    No Partitioning
    → Build one shared hash table using atomics
    → Partially vectorized

    Min Partitioning
    → Partition building table
    → Build one hash table per thread
    → Fully vectorized

    Max Partitioning
    → Partition both tables repeatedly
    → Build and probe cache-resident hash tables
    → Fully vectorized

    127

  • JOINS

    128

    [Figure: join time (sec), broken into Partition, Build, Probe, and Build+Probe phases, for Scalar vs. Vector variants of No Partitioning, Min Partitioning, and Max Partitioning. Workload: 200M ⨝ 200M tuples (32-bit keys & payloads) on a Xeon Phi 7120P – 61 Cores + 4×HT.]

  • PARTING THOUGHTS

    Vectorization is essential for OLAP queries. These algorithms don’t work as well when the data exceeds your CPU cache.

    We can combine all the intra-query parallelism optimizations we’ve talked about in a DBMS.
    → Multiple threads processing the same query.
    → Each thread can execute a compiled plan.
    → The compiled plan can invoke vectorized operations.

    129

  • NEXT CLASS

    Reminder: No class on 4/18 (Thu).

    Reminder: Guest lecture on 4/23 (Tue).

    Reminder: Extra credit due on 4/18 (Thu).

    Reminder: Final presentation on 4/25 (Thu).

    130