concurrency: intro and threads...l23: intro to concurrency cse333, winter 2020 sequential can be...
TRANSCRIPT
CSE333, Winter 2020L23: Intro to Concurrency
Guest Instructor: Travis McGaha
Teaching Assistants:
Andrew Hu Austin Chan Brennan Stein
Cheng Ni Cosmo Wang Diya Joy
Guramrit Singh Mengqi Cen Pat Kosakanchit
Rehaan Bhimani Renshu Gu Travis McGaha
Zachary Keyes
Concurrency: Intro and ThreadsCSE 333 Winter 2020
CSE333, Winter 2020L23: Intro to Concurrency
Administrivia
❖ HW4 due two Thursdays from now (03/12)
▪ You can use two late days on HW4.
❖ Exercise 17 to be released Friday.
▪ Due Monday 3/09 @ 11 am
▪ 🎉 The Last Exercise 🎉
2
CSE333, Winter 2020L23: Intro to Concurrency
Some Common HW4 Bugs
❖ Your server works, but is really, really slow▪ Check the 2nd argument to the QueryProcessor constructor
❖ Funny things happen after the first request▪ Make sure you’re not destroying the HTTPConnection object
too early (e.g. falling out of scope in a while loop)
❖ Server crashes on a blank request▪ Make sure that you handle the case that read() (or WrappedRead()) returns 0
3
CSE333, Winter 2020L23: Intro to Concurrency
Lecture Outline
❖ From Query Processing to a Search Server
❖ Intro to Concurrency
❖ Threads and other concurrency methods
❖ Search Server with pthreads
4
CSE333, Winter 2020L23: Intro to Concurrency
Building a Web Search Engine
❖ We have:
▪ A web index
• A map from <word> to <list of documents containing the word>
• This is probably sharded over multiple files
▪ A query processor
• Accepts a query composed of multiple words
• Looks up each word in the index
• Merges the result from each word into an overall result set
5
CSE333, Winter 2020L23: Intro to Concurrency
Search Engine Architecture
6
query processor
clientindex
file
index file
index file
CSE333, Winter 2020L23: Intro to Concurrency
Search Engine (Pseudocode)
7
doclist Lookup(string word) {
bucket = hash(word);
hitlist = file.read(bucket);
foreach hit in hitlist {
doclist.append(file.read(hit));
}
return doclist;
}
main() {
SetupServerToReceiveConnections();
while (1) {
string query_words[] = GetNextQuery();
results = Lookup(query_words[0]);
foreach word in query[1..n] {
results = results.intersect(Lookup(word));
}
Display(results);
}
}
CSE333, Winter 2020L23: Intro to Concurrency
Execution Timeline: a Multi-Word Query
8
network I/O
main()
GetNextQuery()
disk I/O
Lookup()
disk I/O
Lookup()
disk I/O
Lookup()
network I/O
Display()
GetNextQuery()
• • •
time
query
CPU
CPU
results.intersect()
results.intersect()
CSE333, Winter 2020L23: Intro to Concurrency
What About I/O-caused Latency?
❖ Jeff Dean’s “Numbers Everyone Should Know” (LADIS ‘09)
9
CSE333, Winter 2020L23: Intro to Concurrency
Execution Timeline: To Scale
10
network I/O
main()
disk I/O
disk I/O
disk I/O
• • •
time
query
network I/O
CPU
CPU
CSE333, Winter 2020L23: Intro to Concurrency
Multiple (Single-Word) Queries
11
CPU 1.a
I/O 1.b
CPU 1.c
I/O 1.d
CPU 1.e
CPU 2.a
I/O 2.b
CPU 2.c
I/O 2.d
CPU 2.e
CPU 3.a
I/O 3.b
CPU 3.c
I/O 3.d
CPU 3.e
time
query 2
query 3
query 1
# is the Query Number#.a -> GetNextQuery()#.b -> network I/O#.c -> Lookup() & file.read()#.d -> Disk I/O#.e -> Intersect()
& Display()
CSE333, Winter 2020L23: Intro to Concurrency
Multiple Queries: To Scale
12
I/O 1.b
I/O 1.d
time
query 2
query 1
I/O 1.b
I/O 1.d
I/O 1.b
I/O 1.d
query 3
CSE333, Winter 2020L23: Intro to Concurrency
Uh-Oh (1 of 2)
13
query processor
client
client
client
client
client
index file
index file
index file
CSE333, Winter 2020L23: Intro to Concurrency
Uh-Oh (2 of 2)
14
CPU 1.a
I/O 1.b
CPU 1.c
I/O 1.d
CPU 1.e
CPU 2.a
I/O 2.b
CPU 2.c
I/O 2.d
CPU 2.e
CPU 3.a
I/O 3.b
CPU 3.c
I/O 3.d
CPU 3.e
time
query 2
query 3
query 1
The CPU is idle most of the time!
(picture not to scale)
Only one I/O request at a time is “in flight”
Queries don’t run until earlier queries finish
CSE333, Winter 2020L23: Intro to Concurrency
Sequential Can Be Inefficient
❖ Only one query is being processed at a time
▪ All other queries queue up behind the first one
▪ And clients queue up behind the queries …
❖ Even while processing one query, the CPU is idle the vast majority of the time
▪ It is blocked waiting for I/O to complete
• Disk I/O can be very, very slow (10 million times slower …)
❖ At most one I/O operation is in flight at a time
▪ Missed opportunities to speed I/O up
• Separate devices in parallel, better scheduling of a single device, etc.
15
CSE333, Winter 2020L23: Intro to Concurrency
Lecture Outline
❖ From Query Processing to a Search Server
❖ Intro to Concurrency
❖ Concurrent Programming Styles
❖ Threads
❖ Search Server with pthreads
16
CSE333, Winter 2020L23: Intro to Concurrency
Concurrency
❖ Our search engine could run concurrently:
▪ Example: Execute queries one at a time, but issue I/O requestsagainst different files/disks simultaneously
• Could read from several index files at once, processing the I/O results as they arrive
▪ Example: Our web server could execute multiple queries at the same time
• While one is waiting for I/O, another can be executing on the CPU
❖ Concurrency != parallelism
▪ Concurrency is doing multiple tasks at a time
▪ Parallelism is executing multiple CPU instructions simultaneously
17
CSE333, Winter 2020L23: Intro to Concurrency
A Concurrent Implementation
❖ Use multiple “workers”
▪ As a query arrives, create a new “worker” to handle it
• The “worker” reads the query from the network, issues read requests against files, assembles results and writes to the network
• The “worker” uses blocking I/O; the “worker” alternates between consuming CPU cycles and blocking on I/O
▪ The OS context switches between “workers”
• While one is blocked on I/O, another can use the CPU
• Multiple “workers’” I/O requests can be issued at once
❖ So what should we use for our “workers”?
18
CSE333, Winter 2020L23: Intro to Concurrency
Lecture Outline
❖ From Query Processing to a Search Server
❖ Intro to Concurrency
❖ Threads and other concurrency methods
❖ Search Server with pthreads
19
CSE333, Winter 2020L23: Intro to Concurrency
Review: Processes
❖ The components of a “process” are:
▪ Resources such as file descriptors and sockets
▪ An address space (page tables, ect.)
❖ Different Processes have independent components:
▪ Most importantly: Isolated address spaces.
❖ An address space of a process can hold stack(s) that distinguish different “threads” of execution
20
CSE333, Winter 2020L23: Intro to Concurrency
Introducing Threads
❖ Separate the concept of a process from the “thread of execution”
▪ Threads are contained within a process
▪ Usually called a thread, this is a sequential execution stream within a process
❖ In most modern OS’s:
▪ Threads are the unit of scheduling.
21
thread
CSE333, Winter 2020L23: Intro to Concurrency
Multi-threaded Search Engine (Pseudocode)
22
doclist Lookup(string word) {
bucket = hash(word);
hitlist = file.read(bucket);
foreach hit in hitlist
doclist.append(file.read(hit));
return doclist;
}
ProcessQuery(string query_words[]) {
results = Lookup(query_words[0]);
foreach word in query[1..n]
results = results.intersect(Lookup(word));
Display(results);
}
main() {
while (1) {
string query_words[] = GetNextQuery();
CreateThread(ProcessQuery(query_words));
}
}
CSE333, Winter 2020L23: Intro to Concurrency
Multi-threaded Search Engine (Execution)
23
CPU 1.a
I/O 1.b
CPU 1.c
I/O 1.d
CPU 1.e
CPU 2.a
I/O 2.b
CPU 3.a
I/O 3.b
CPU 3.c
I/O 3.d
CPU 3.e
time
query 2
query 3
query 1
CPU 2.c
I/O 2.d
CPU 2.e
CSE333, Winter 2020L23: Intro to Concurrency
Why Threads?
❖ Advantages:
▪ You (mostly) write sequential-looking code
▪ Threads can run in parallel if you have multiple CPUs/cores
❖ Disadvantages:
▪ If threads share data, you need locks or other synchronization
• Very bug-prone and difficult to debug
▪ Threads can introduce overhead
• Lock contention, context switch overhead, and other issues
▪ Need language support for threads
24
CSE333, Winter 2020L23: Intro to Concurrency
Threads vs. Processes
❖ In most modern OS’s:
▪ A Process has a unique: address space, OS resources, & security attributes
▪ A Thread has a unique: stack, stack pointer, program counter,& registers
▪ Threads are the unit of scheduling and processes are their containers; every process has at least one thread running in it
25
CSE333, Winter 2020L23: Intro to Concurrency
Threads vs. Processes
26
OS kernel [protected]
Stackparent
Heap (malloc/free)
Read/Write Segments.data, .bss
Shared Libraries
Read-Only Segments.text, .rodata
OS kernel [protected]
Stackparent
Heap (malloc/free)
Read/Write Segments.data, .bss
Shared Libraries
Read-Only Segments.text, .rodata
pthread_create()
Stackchild
CSE333, Winter 2020L23: Intro to Concurrency
Threads vs. Processes
27
OS kernel [protected]
Stackchild
Heap (malloc/free)
Read/Write Segments.data, .bss
Shared Libraries
Read-Only Segments.text, .rodata
OS kernel [protected]
Stackparent
Heap (malloc/free)
Read/Write Segments.data, .bss
Shared Libraries
Read-Only Segments.text, .rodata
OS kernel [protected]
Stackparent
Heap (malloc/free)
Read/Write Segments.data, .bss
Shared Libraries
Read-Only Segments.text, .rodata
fork()
CSE333, Winter 2020L23: Intro to Concurrency
Alternative: Processes
❖ What if we forked processes instead of threads?
❖ Advantages:
▪ No shared memory between processes
▪ No need for language support; OS provides “fork”
▪ Processes are isolated. If one crashes, other processes keep going
❖ Disadvantages:
▪ More overhead than threads during creation and context switching
▪ Cannot easily share memory between processes – typically communicate through the file system
28
CSE333, Winter 2020L23: Intro to Concurrency
Alternate: Different I/O Handling
❖ Use asynchronous or non-blocking I/O
❖ Your program begins processing a query
▪ When your program needs to read data to make further progress, it registers interest in the data with the OS and then switches to a different query
▪ The OS handles the details of issuing the read on the disk, or waiting for data from the console (or other devices, like the network)
▪ When data becomes available, the OS lets your program know
❖ Your program (almost never) blocks on I/O
29
CSE333, Winter 2020L23: Intro to Concurrency
Non-blocking I/O
❖ Reading from the network can truly block your program
▪ Remote computer may wait arbitrarily long before sending data
❖ Non-blocking I/O (network, console)
▪ Your program enables non-blocking I/O on its file descriptors
▪ Your program issues read() and write() system calls
• If the read/write would block, the system call returns immediately
▪ Program can ask the OS which file descriptors are readable/writeable
• Program can choose to block while no file descriptors are ready
30
CSE333, Winter 2020L23: Intro to Concurrency
Outline (next two lectures)
❖ We’ll look at different searchserver implementations
▪ Sequential
▪ Concurrent via dispatching threads – pthread_create()
▪ Concurrent via forking processes – fork()
• 🎉Lecture With Andrew Hu!🎉
❖ Reference: Computer Systems: A Programmer’s Perspective, Chapter 12 (CSE 351 book)
31
CSE333, Winter 2020L23: Intro to Concurrency
Sequential
❖ Pseudocode:
❖ See searchserver_sequential/
32
listen_fd = Listen(port);
while (1) {
client_fd = accept(listen_fd);
buf = read(client_fd);
resp = ProcessQuery(buf);
write(client_fd, resp);
close(client_fd);
}
CSE333, Winter 2020L23: Intro to Concurrency
Why Sequential?
❖ Advantages:
▪ Super(?) simple to build/write
❖ Disadvantages:
▪ Incredibly poor performance
• One slow client will cause all others to block
• Poor utilization of resources (CPU, network, disk)
33
CSE333, Winter 2020L23: Intro to Concurrency
Threads
❖ Threads are like lightweight processes
▪ They execute concurrently like processes
• Multiple threads can run simultaneously on multiple CPUs/cores
▪ Unlike processes, threads cohabitate the same address space
• Threads within a process see the same heap and globals and can communicate with each other through variables and memory
– But, they can interfere with each other – need synchronization for shared resources
• Each thread has its own stack
34
CSE333, Winter 2020L23: Intro to Concurrency
Single-Threaded Address Spaces
❖ Before creating a thread
▪ One thread of execution running in the address space
• One PC, stack, SP
▪ That main thread invokes a function to create a new thread
• Typically pthread_create()
35
OS kernel [protected]
Stackparent
Heap (malloc/free)
Read/Write Segments.data, .bss
Shared Libraries
Read-Only Segments.text, .rodata
SPparent
PCparent
CSE333, Winter 2020L23: Intro to Concurrency
Multi-threaded Address Spaces
❖ After creating a thread
▪ Two threads of execution running in the address space
• Original thread (parent) and new thread (child)
• New stack created for child thread
• Child thread has its own values of the PC and SP
▪ Both threads share the other segments (code, heap, globals)
• They can cooperatively modify shared data
36
OS kernel [protected]
Stackparent
Heap (malloc/free)
Read/Write Segments.data, .bss
Shared Libraries
Read-Only Segments.text, .rodata
SPparent
PCparent
StackchildSPchild
PCchild
CSE333, Winter 2020L23: Intro to Concurrency
Lecture Outline
❖ From Query Processing to a Search Server
❖ Intro to Concurrency
❖ Threads
❖ Search Server with pthreads
37
CSE333, Winter 2020L23: Intro to Concurrency
POSIX Threads (pthreads)
❖ The POSIX APIs for dealing with threads▪ Declared in pthread.h
• Not part of the C/C++ language (cf. Java)
▪ To enable support for multithreading, must include -pthreadflag when compiling and linking with gcc command
• gcc –g –Wall –std=c11 –pthread –o main main.c
38
CSE333, Winter 2020L23: Intro to Concurrency
Creating and Terminating Threads
❖
▪ Creates a new thread into *thread, with attributes *attr(NULL means default attributes)
▪ Returns 0 on success and an error number on error (can check against error constants)
▪ The new thread runs start_routine(arg)
❖
▪ Equivalent of exit(retval); for a thread instead of a process
▪ The thread will automatically exit once it returns from start_routine()
39
int pthread_create(
pthread_t* thread,
const pthread_attr_t* attr,
void* (*start_routine)(void*),
void* arg);
void pthread_exit(void* retval);