improving server applications with system transactions

34
Improving Server Applications with System Transactions Sangman Kim, Michael Z. Lee, Alan M. Dunn, Owen S. Hofmann, Xuan Wang, Emmett Witchel, Donald E. Porter 1

Upload: collin

Post on 24-Feb-2016

38 views

Category:

Documents


0 download

DESCRIPTION

Sangman Kim , Michael Z. Lee, Alan M. Dunn, Owen S. Hofmann, Xuan Wang, Emmett Witchel , Donald E. Porter. Improving Server Applications with System Transactions. Poor OS API Support for Concurrency. Fine-grained locking - Bug-prone, hard to maintain - OS provides poor support. - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: Improving Server Applications with System Transactions

1

Improving Server Applications with System Transactions

Sangman Kim, Michael Z. Lee, Alan M. Dunn, Owen S. Hofmann, Xuan Wang, Emmett Witchel, Donald E. Porter

Page 2: Improving Server Applications with System Transactions

Poor OS API Support for Concurrency

2

Para

llelis

m

Maintainability

Fine-grained locking - Bug-prone, hard to maintain - OS provides poor support

Coarse-grained locking - Reduced resource utilization

Page 3: Improving Server Applications with System Transactions

System Transaction Improves OS API Concurrency

3

Para

llelis

m

Server Applications

working with OS API

System Transactio

n

Server Applications

working with OS API

Maintainability

Page 4: Improving Server Applications with System Transactions

4

LinuxTxOS

Improving System Transactions

TxOS provides operating system transaction [Porter et al., SOSP 2009] Transaction for OS objects (e.g., files,

pipes)System transaction in TxOS

TxOS

ApplicationJVM

Middleware state sharing with multithreading

TxOS system callsMiddleware state sharing

Page 5: Improving Server Applications with System Transactions

5

TxOS

Improving System Transactions

TxOS provides operating system transaction [Porter et al., SOSP 2009] Transaction for OS objects (e.g., files ,

pipes)Synchronization in legacy code

ApplicationJVM

TxOS system calls

Synchronization primitivesMiddleware state sharing

Page 6: Improving Server Applications with System Transactions

TxOS

Improving System Transactions

TxOS provides operating system transaction [Porter et al., SOSP 2009] Transaction for OS objects (e.g., files,

pipes) TxOS+: Improved system

transactions

6

ApplicationJVM

TxOS system calls

TxOS+TxOS+: pause/resume,commit ordering, and more

Up to 88% throughput improvement

At most 40 application line changes

Synchronization primitivesMiddleware state sharing

Page 7: Improving Server Applications with System Transactions

7

Background: system transaction

System transactions in action

Challenges for rewriting applications

Implementation and evaluation

Outline

Page 8: Improving Server Applications with System Transactions

Background: System Transaction Transaction Interface and semantics

System calls: xbegin(), xend(), xabort()

ACID semantics ▪ Atomic – all or nothing▪ Consistent – one consistent state to another▪ Isolated – updates as if only one concurrent transaction▪ Durable – committed transactions on disk

Optimistic concurrency control

Fix synchronization issues with OS APIs8

Page 9: Improving Server Applications with System Transactions

Background: System Transaction

Lazy versioning: speculative copy for data

TxOS requires no special hardware9

xbegin();write(f, buf);xend(); CommitAbort

Conflict!inode

header

inode iinumlock

inode datasize

mode…

Copy of inode data

Page 10: Improving Server Applications with System Transactions

10

Background: system transaction

System transactions in action

Challenges for rewriting applications

Implementation and evaluation

Outline

Page 11: Improving Server Applications with System Transactions

11

Applications Parallelized with OS Transactions

Parallelizing applications that synchronize on OS state

Example 1: State-machine replication Constraint: Deterministic state update

Example 2: IMAP Email Server Constraint: Consistent file system

operations

Page 12: Improving Server Applications with System Transactions

12

Example 1: Parallelizing State-machine Replication

Core component of fault tolerant services e.g., Chubby, Zookeeper, Autopilot

Replicas execute the same sequence of operations Often single-threaded to avoid non-determinism

Ordered transaction Makes parallel OS state updates deterministic Applications determine commit order of

transactions

Page 13: Improving Server Applications with System Transactions

13

Example 2: Parallelizing IMAP Email ServersEveryone has concurrent email

clients Desktop, laptop, tablets, phones, .... Need concurrent access to stored emails

Brief history of email storage formats mbox: single file, file locking Lockless Maildir Dovecot Maildir: return of file locking

Page 14: Improving Server Applications with System Transactions

14

mbox Single file mailbox of email messages

Synchronization with file-locking▪ One of fcntl(), flock(), lock file (.mbox.lock)▪ Very coarse-grained locking

mbox: Database Without Parallelism

~/.mboxFrom MAILER-DAEMON Wed Apr 11 09:32:28 2012 From: Sangman Kim <[email protected]> To: EuroSys 2012 audienceSubject: mbox needs file lock. Maildir hides message.…..From MAILER-DAEMON Wed Apr 11 09:34:51 2012From: Sangman Kim <[email protected]> To: EuroSys 2012 audienceSubject: System transactions good, file locks bad!….

Page 15: Improving Server Applications with System Transactions

15

Maildir: Parallelism Through Lockless Design Maildir: Lockless alternative to mbox

Directories of message files Each file contains a message Directory access with no synchronization

(originally)

Message filenames contain flagsMaildir/cur 00000000.00201.host:2,T00001000.00305.host:2,R

00002000.02619.host:2,T00010000.08919.host:2,S00015000.10019.host:2,S

TrashedRepliedTrashedSeenSeen

Page 16: Improving Server Applications with System Transactions

16

Messages Hidden with Lockless Maildir

PROCESS 2 (MARKING)

if (access(“043:2,S”)):

rename(“043:2,S”, “043:2,R”)

PROCESS 1 (LISTING)

while (f = readdir(“Maildir/cur”)):

print f.name

018:2,S 021:2,S 052:2,S 061:2,SSeen

“Maildir/cur” directory

Seen043:2,SSeen Seen Seen

Page 17: Improving Server Applications with System Transactions

17

043:2,RReplied

Messages Hidden with Lockless Maildir

PROCESS 2 (MARKING)

if (access(“043:2,S”)):

rename(“043:2,S”, “043:2,R”)

PROCESS 1 (LISTING)

while (f = readdir(“Maildir/cur”)):

print f.name

018:2,S 021:2,S 052:2,S 061:2,SSeen

“Maildir/cur” directory

Seen043:2,SSeen Seen Seen

Process 1 Result018:2,S021:2,S052:2,S061:2,S

Message missing!

Page 18: Improving Server Applications with System Transactions

18

Return of The Coarse-grained File Locking Maildir synchronization

Lockless

File locks▪ Per-directory coarse-grained locking▪ Complexity of Maildir, performance of mbox

System transactions

“certain anomalous situations may result” – Courier IMAP manpage

Page 19: Improving Server Applications with System Transactions

Maildir Parallelized with System Transaction

PROCESS 1 (MARKING)

xbegin()

if (access(“XXX:2,S”)):

rename(“XXX:2,S”,

“XXX:2,R”)xend()

PROCESS 2 (MESSAGE LISTING)

xbegin()

while (f = readdir(“Maildir/cur”)):

print f.name

xend()

xbegin()

xend()

xbegin()

xend()

Consistent directory accesses with better parallelism

19

Page 20: Improving Server Applications with System Transactions

20

Background: system transaction

System transactions in action

Challenges for rewriting applications

Implementation and evaluation

Outline

Page 21: Improving Server Applications with System Transactions

21

Challenges of Rewriting Applications

1. Middleware state sharing

2. Deterministic parallel update for system state

3. Composing with other synchronization primitives

Page 22: Improving Server Applications with System Transactions

22

Middleware and System Transaction Problem with memory management

Multiple threads share the same heap

Thread 1 Thread 2In Transaction

xbegin();p1 = malloc();

xabort();p2 = malloc();

*p2 = 1;

Middleware (libc)

Kernel mmap()

Heap

Transactional object for heap

p1

Page 23: Improving Server Applications with System Transactions

23

Middleware and System Transaction Problem with memory management

Multiple threads share the same heap

Thread 1 Thread 2In Transaction

xbegin();p1 = malloc();

xabort();p2 = malloc();

*p2 = 1;

Middleware (libc)

Kernel

Transactional object for heap

Heap

FAULT!Certain middleware actions should not roll back

p1 p2unmapped

Page 24: Improving Server Applications with System Transactions

24

Two Types of Actions on Middleware State

USER-INITIATED ACTION User changes system

state Most file accesses Most synchronization

MIDDLEWARE-INITIATED System state changed

as side effect of user action malloc() memory mapping Java garbage collection Dynamic linking

Middleware state shared among user threads Can’t just roll back!

Page 25: Improving Server Applications with System Transactions

25

Handling Middleware-Initiated Actions Transaction pause/resume

Expose state changes by middleware-initiated actions to other threads

Additional system calls ▪ xpause(), xresume()

Limited complexity increase▪ We used pause/resume 8 times in glibc, 4 times in

JVM▪ Only used in application for debugging

Page 26: Improving Server Applications with System Transactions

26

Pause/Resume In JVM Execution

SysTransaction.begin();

files = dir.list();

SysTransaction.end();

Java code

xpause()

xresume()

xbegin();

files = dir.list();

VM operations(garbage collection)

xend();

JVM Execution

Page 27: Improving Server Applications with System Transactions

27

Other Challenges for Maturing TxOS 17,000 lines of kernel changes

Transactionalizing file descriptor table Handling page lock for disk I/O Memory protection Optimization with directory caching Reorganizing data structure and more

Details in the paper

Page 28: Improving Server Applications with System Transactions

28

Background: system transaction

System transactions in action

Challenges for rewriting applications

Implementation and evaluation

Outline

Page 29: Improving Server Applications with System Transactions

29

Application 1: Parallelized BFT Application

Implemented in UpRight BFT library

Fault tolerant routing backend Graph stored in a file Compute shortest path Edge add/remove

Ordered transactions for deterministic update

Page 30: Improving Server Applications with System Transactions

30

Minimal Application Code Change

Component Total LOC Changed LOC

Routing application

1,006 18 (1.8%)

Upright Library 22,767 174 (0.7%)

JVM 496,305 384 (0.0008%)

glibc 1,027,399 826 (0.0008%)

Page 31: Improving Server Applications with System Transactions

31

0 10 20 30 40 50 60 70 80 90 1000

5001000150020002500300035004000

TxOS, denseLinux,dense

Deterministic State Update with Better Throughput

Thro

ughp

ut (

req/

s)

Dense graph: 88%

tput

Sparse graph:

11% tput

Work to add/delete edges small compared to scheduling

overhead

Write ratio (%)BFT graph server

Page 32: Improving Server Applications with System Transactions

32

Application 2: Dovecot Maildir access Dovecot mail server

Uses directory lock files for maildir accesses

Locking is replaced with system transactions Changed LoC: 40 out of 138,723

Benchmark: Parallel IMAP clients Each client executes operations on a random

message▪ Read: message read▪ Write: message creation/deletion▪ 1500 messages total

Page 33: Improving Server Applications with System Transactions

33

Mailbox Consistency with Better Throughput

Dovecot benchmark with 4 clients

0102030405060708090

0 10 25 50 100Write ratio (%)

Tput

Impr

ovem

ent

(%)

Better block scheduling

enhances write performance

Page 34: Improving Server Applications with System Transactions

34

Conclusion: OS Transactions Improve Server Performance

System transactions parallelize tricky server applications Parallel Dovecot maildir operations Parallel BFT state update

System transaction improves throughput with few application changes Up to 88% throughput improvement At most 40 changed lines of application

code