

MEMORY SYSTEM CHARACTERIZATION OF COMMERCIAL WORKLOADS

Authors:

Luiz André Barroso (Google, DEC; worked on Piranha)

Kourosh Gharachorloo (Compaq, DEC; worked on Dash and Flash)

Edouard Bugnion (one of the original founders of VMware; also worked on SimOS)

Presented by: David Eitel, March 31, 2010


Types of Commercial Applications

Online Transaction Processing (OLTP)

Decision Support Systems (DSS)

Web Index Search (WIS)

Source: S. Brin and L. Page. “The Anatomy of a Large-Scale Hypertextual Web Search Engine.”


Benchmarks

Oracle Database Engine

TPC-B Banking Benchmark for OLTP

TPC-D Benchmark for DSS (read-only queries)

AltaVista

Sources: http://georgiaconsortium.org/images/Banking-Coins.jpg, http://greencanada.files.wordpress.com/2009/04/databases.jpg, http://sixrevisions.com/web_design/popular-search-engines-in-the-90s-then-and-now/


Monitoring Results

Source: Fig. 4

OLTP has more complex queries than DSS/AV.

It is important to have low-latency access to non-primary caches because the OLTP working set is very large.

Cache miss rates for DSS are low; the misses that do occur are on large database tables.

[Figure: Breakdown of the execution time. Labels: sum of single- and dual-issue cycles; pipeline and address translation related stalls; misses. Callouts: big CPI; lots of Bcache misses; >75% memory stalls.]

Scache = secondary cache; Bcache = board-level cache
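The callouts on this slide (big CPI, more than 75% memory stalls) can be read through a simple additive CPI model: total CPI is a base issue CPI plus, for each level of the hierarchy, the misses served there times the latency to reach it. The sketch below only illustrates that arithmetic. The function name `cpi_breakdown` is hypothetical, and every rate and latency in it is a made-up value, not a measurement from the paper or from Fig. 4.

```python
def cpi_breakdown(base_cpi, served_per_instr, latency_cycles):
    """Additive CPI model (a simplification; ignores overlap of stalls).

    served_per_instr: primary-cache misses per instruction served by each level.
    latency_cycles:   stall cycles paid when a miss is served by that level.
    Returns (total CPI, fraction of cycles in memory stalls, per-level stall CPI).
    """
    stall_cpi = {
        level: served_per_instr[level] * latency_cycles[level]
        for level in served_per_instr
    }
    total_stall = sum(stall_cpi.values())
    total_cpi = base_cpi + total_stall
    return total_cpi, total_stall / total_cpi, stall_cpi


# Hypothetical inputs, chosen only to show how a large memory-stall share arises.
served = {"scache": 0.04, "bcache": 0.02, "memory": 0.01}
latency = {"scache": 10, "bcache": 30, "memory": 100}

total, mem_fraction, per_level = cpi_breakdown(1.0, served, latency)
print(f"total CPI: {total:.2f}")
print(f"memory stall fraction: {mem_fraction:.0%}")
for level, cpi in per_level.items():
    print(f"  {level}: {cpi:.2f} cycles per instruction")
```

Even with low miss rates per instruction, the long latencies of the outer levels dominate the total, which is the pattern the slide highlights for OLTP.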


Simulation Results for OLTP

Source: Fig. 5

[Figure: labels: associativity; cache size; data capacity/conflict misses.]

INST = instruction execution; CACHE = stalls within cache hierarchy; MEM = memory system stalls

Idle time increases with bigger caches.

The I/O latency cannot be hidden with faster processing rates.

Faster processing rates with a more efficient memory system = more commits ready for the log writer (I/O).

OLTP benefits from larger Bcaches.
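The cache size and associativity sweep on this slide can be imitated on a toy scale with a small set-associative cache model. The sketch below is not the simulator used in the paper; it is a minimal LRU model with a hypothetical function name (`count_misses`) and a synthetic address trace, meant only to show how conflict misses disappear when either associativity or capacity grows.

```python
from collections import OrderedDict


def count_misses(addresses, cache_bytes, line_bytes, ways):
    """Count misses in a set-associative cache with LRU replacement."""
    num_sets = cache_bytes // (line_bytes * ways)
    sets = [OrderedDict() for _ in range(num_sets)]
    misses = 0
    for addr in addresses:
        line = addr // line_bytes
        cache_set = sets[line % num_sets]
        if line in cache_set:
            cache_set.move_to_end(line)         # mark as most recently used
        else:
            misses += 1
            if len(cache_set) >= ways:
                cache_set.popitem(last=False)   # evict least recently used
            cache_set[line] = True
    return misses


# Synthetic trace: two addresses that map to the same set of a
# 1 KB direct-mapped cache with 64-byte lines, accessed alternately.
trace = [0, 1024] * 10

print("1 KB, direct-mapped:", count_misses(trace, 1024, 64, 1))
print("1 KB, 2-way:        ", count_misses(trace, 1024, 64, 2))
print("2 KB, direct-mapped:", count_misses(trace, 2048, 64, 1))
```

In this trace every access misses in the small direct-mapped configuration, while both the 2-way and the larger direct-mapped configurations miss only on the first touch of each line.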


More Simulation Results (OLTP and DSS)

DSS works well with current cache sizes because its working sets are small (few misses in on-chip caches).

Replacement and instruction miss rates are not affected by line size, which is good for larger cache sizes.

False sharing increases with cache line size.

What would be different if increased latency and bandwidth were accounted for when line size increases?

Are the results NOT valid because size(database) = size(main memory)?

Sources: Fig. 7 and Fig. 8
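The point on this slide that false sharing increases with cache line size can be illustrated with a toy write-invalidate model. The sketch below rests on stated assumptions and is not the paper's methodology: the function name `count_false_sharing` is hypothetical, the trace is synthetic, and the classification is deliberately simplified (a coherence miss counts as false sharing when another CPU owns the line but never wrote the address being accessed).

```python
def count_false_sharing(writes, line_bytes):
    """Count false-sharing misses in a simplified write-invalidate model.

    writes: list of (cpu, byte_address) pairs in global order.
    """
    line_owner = {}    # line number  -> CPU that last wrote the line
    addr_writer = {}   # byte address -> CPU that last wrote that address
    false_misses = 0
    for cpu, addr in writes:
        line = addr // line_bytes
        owner = line_owner.get(line)
        if owner is not None and owner != cpu:
            # Coherence miss: another CPU holds the line. It is false
            # sharing if that CPU did not write this particular address.
            if addr_writer.get(addr, cpu) == cpu:
                false_misses += 1
        line_owner[line] = cpu
        addr_writer[addr] = cpu
    return false_misses


# Synthetic trace: CPU 0 writes address 0 and CPU 1 writes address 64,
# taking turns. The two CPUs never touch the same data.
trace = [(0, 0), (1, 64)] * 5

for line_bytes in (32, 64, 128, 256):
    print(line_bytes, "byte lines:", count_false_sharing(trace, line_bytes))
```

With 32- and 64-byte lines the two addresses sit in different lines and no coherence misses occur; once the line is large enough to hold both, nearly every write invalidates the other CPU's copy.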


Important Things to Remember

As the number of processors increases, communication stalls increase (see Fig. 6).

O/S activity and I/O latencies do not greatly affect the behavior of database engines.

OLTP has instruction and data locality that is helped by off-chip caches.

DSS and WIS have working sets that fit in memory and are sensitive to on-chip caches.

Source: http://www.stress-treatment-21.com/wp-content/uploads/2009/05/thinking-monkey.bmp


Discussion Questions

What are some new commercial applications that have developed since this paper was written?

How much have the issues in this paper been addressed in recent architecture designs?

What should we focus on in the “parallel” future to increase performance for commercial applications?

Could we change commercial workloads to function more like scientific workloads to obtain performance gains?

Source: http://www.vosibilities.com/wp-content/uploads/2009/05/bpm-questions-you-should-ask-your-bpms-vendor1.jpg