INSTITUTE OF COMPUTING TECHNOLOGY
DMA Cache: Architecturally Separate I/O Data from CPU Data for Improving I/O Performance
Dang Tang, Yungang Bao, Weiwu Hu, Mingyu Chen
2010.1
Institute of Computing Technology (ICT)
Chinese Academy of Sciences (CAS)
The role of I/O
• I/O is ubiquitous
  • Loading binary files: Disk → Memory
  • Browsing the web, media streaming: Network → Memory, …
• I/O is significant
  • Many commercial applications are I/O intensive: databases, etc.
State-of-the-Art I/O Technologies
• I/O bus: 20 GB/s
  • PCI-Express 2.0
  • HyperTransport 3.0
  • QuickPath Interconnect
• I/O devices
  • SSD RAID: 1.2 GB/s
  • 10GE: 1.25 GB/s
  • Fusion-io: 8 GB/s, 1M IOPS (2KB random 70/30 read/write mix)
Direct Memory Access (DMA)
• DMA is used for I/O operations in all modern computers.
• DMA allows I/O subsystems to access system memory independently of the CPU.
• Many I/O devices have DMA engines, including disk drive controllers, graphics cards, network cards, sound cards and GPUs.
Outline
Revisiting I/O
DMA Cache Design
Evaluations
Conclusions
An Example of Disk Read: DMA Receiving Operation
[Figure: DMA Engine, CPU and Memory, with Descriptor, Driver Buffer and Kernel Buffer; steps ① to ④ mark the order of operations.]
• Cache access latency: ~20 cycles
• Memory access latency: ~200 cycles
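The gap between the two latencies above is what makes the placement of I/O data matter. The following is a minimal sketch, assuming a simple model in which every reference either hits in cache (~20 cycles) or goes to memory (~200 cycles). The function name and the sample hit rates are illustrative and do not come from the slides.

```python
CACHE_LATENCY = 20    # cycles, approximate figure from the slide
MEMORY_LATENCY = 200  # cycles, approximate figure from the slide

def average_latency(hit_rate: float) -> float:
    """Average cycles per reference for a cache hit rate between 0.0 and 1.0."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * CACHE_LATENCY + (1.0 - hit_rate) * MEMORY_LATENCY

# Hypothetical hit rates, chosen only to show the trend.
for rate in (0.0, 0.5, 0.9):
    print(f"hit rate {rate:.0%}: {average_latency(rate):.0f} cycles on average")
```

Under this model, data that the CPU always has to fetch from memory costs ten times as much per reference as data it always finds in cache.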
Direct Cache Access [Ram-ISCA05]
[Figure: DMA Engine, CPU and Memory, with Descriptor, Driver Buffer and Kernel Buffer; steps ① to ④ mark the order of operations.]
• This is a typical Shared-Cache Scheme
Prefetch-Hint Approach [Kumar-Micro07]
Problems of Shared-Cache Scheme
• Cache pollution
• Cache thrashing
• Not suitable for other I/O
• Degrades performance when DMA requests are large (>100KB) for the “Oracle + TPC-H” application
To address this problem in depth, we need to investigate the characteristics of I/O data.
I/O Data vs. CPU Data
[Figure: labels MemCtrl, HMTT, I/O Data, CPU Data, and I/O Data + CPU Data.]
A Short Ad for HMTT [Bao-Sigmetrics08]
• A hardware/software hybrid memory trace tool
• Can support the DDR2 DIMM interface on multiple platforms
• Can collect full-system off-chip memory traces
• Can provide traces with semantic information, e.g., virtual address, process ID, I/O operation
• Can collect traces of commercial applications, e.g., Oracle, web server
[Figure: The HMTT System]
Characteristics of I/O Data (1)
[Figure: % of memory references to I/O data]
[Figure: % of references of various I/O types]
Characteristics of I/O Data (2)
• I/O request size distribution?
Characteristics of I/O Data (3)
• Sequential access in I/O data
• Compared with CPU data, I/O data is very regular
Characteristics of I/O Data (4)
• Reuse Distance (RD): LRU stack distance
[Figure: an LRU stack over blocks 1 to 4, updated reference by reference, and the CDF of reuse distance, where x% of references have RD <= n.]
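Reuse distance can be measured by replaying a reference trace against an LRU stack. The following is a minimal sketch, assuming the convention that a block's distance is its depth in the stack at the time it is re-referenced (1 means it was the most recently used block). The function names and the sample trace are illustrative and are not taken from the slides.

```python
def reuse_distances(trace):
    """Return the LRU stack distance of each reference (None for a first touch)."""
    stack = []       # least recently used block first, most recent last
    distances = []
    for block in trace:
        if block in stack:
            pos = stack.index(block)
            distances.append(len(stack) - pos)  # depth counted from the top
            stack.pop(pos)
        else:
            distances.append(None)              # never seen before
        stack.append(block)                     # block becomes most recent
    return distances

def rd_cdf(distances, n):
    """Fraction of reuses whose distance is <= n (one point of the CDF)."""
    reuses = [d for d in distances if d is not None]
    if not reuses:
        return 0.0
    return sum(d <= n for d in reuses) / len(reuses)

trace = [1, 2, 3, 1, 2, 3, 4, 1]   # hypothetical block addresses
dists = reuse_distances(trace)
print(dists)             # [None, None, None, 3, 3, 3, None, 4]
print(rd_cdf(dists, 3))  # 0.75
```

The list-based stack is quadratic in the worst case. It is enough to show the definition, but a real trace would call for a tree-based structure.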
Characteristics of I/O Data (5)
[Figure: panels labeled DMA-W CPU-R, CPU-RW CPU-RW, and CPU-W DMA-R.]
Rethink I/O & DMA Operation
• 20~40% of memory references are for I/O data in I/O-intensive applications.
• Characteristics of I/O data are different from those of CPU data
  • There is an explicit producer-consumer relationship for I/O data
  • The reuse distance of I/O data is smaller than that of CPU data
  • References to I/O data are primarily sequential
• Separating I/O data and CPU data
Separating I/O data and CPU data
Before Separating
After Separating
Outline
Revisiting I/O
DMA Cache Design
Evaluations
Conclusions
DMA Cache Design Issues
• Write Policy
• Cache Coherence
• Replacement Policy
• Prefetching
Dedicated DMA Cache (DDC)
DMA Cache Design Issues
• Write Policy
  • Adopt a write-allocate policy
  • Both write-back and write-through policies are available
• Cache Coherence
• Replacement Policy
• Prefetching
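The difference between the two write policies under write-allocate can be shown with a toy model. The following is a minimal sketch, assuming a cache with unlimited capacity and a dictionary standing in for memory. The class and its methods are hypothetical and do not describe the actual DMA cache hardware.

```python
class TinyCache:
    """Toy write-allocate cache; `memory` stands in for DRAM."""

    def __init__(self, write_back: bool):
        self.write_back = write_back
        self.lines = {}    # block address -> (data, dirty flag)
        self.memory = {}   # block address -> data

    def write(self, addr, data):
        # Write-allocate: the block is placed in the cache on a write,
        # whether or not it was already present.
        if self.write_back:
            self.lines[addr] = (data, True)    # memory is updated on eviction
        else:
            self.lines[addr] = (data, False)
            self.memory[addr] = data           # write-through: update memory now

    def evict(self, addr):
        data, dirty = self.lines.pop(addr)
        if dirty:
            self.memory[addr] = data           # write-back of a dirty line

wb, wt = TinyCache(write_back=True), TinyCache(write_back=False)
for cache in (wb, wt):
    cache.write(0x100, "payload")
print(0x100 in wb.memory, 0x100 in wt.memory)  # False True
wb.evict(0x100)
print(wb.memory[0x100])                        # payload
```

In both cases the written block is in the cache afterwards, so a later CPU read can hit. The policies differ only in when memory sees the data.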
DMA Cache Design Issues
• Write Policy
• Cache Coherence
  • IO-ESI protocol for the WT policy
  • IO-MOESI protocol for the WB policy
  • The only difference between IO-MOESI/IO-ESI and the original protocols is that the local source and the probe source of state transitions are exchanged
• Replacement Policy
• Prefetching
A Big Issue
How can we prove the correctness of integrating heterogeneous cache coherence protocols in one system?
A Global State Method for Heterogeneous Cache Coherence Protocol [Pong-SPAA93, Pong-JACM98]
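[Figure: the per-cache states of the DMA cache and the CPU caches are combined into one global state. For example, O, S, I gives OS+I+, which is valid (√), while M, I, S gives MS+I+, which is a conflict (X). Sample transitions between the global states EI+, MI+ and S+I+ are labeled R|E, W|* and R|I.]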
Global State Cache Coherence Theorem
Given N (N>1) well-defined cache protocols, they do not conflict if and only if no Conflict Global State exists in the global state transition machine.
[Figure: global state transition machine; transitions are labeled R|x or W|x, e.g. R|E, R|I, R|M, W|*.]
5 global states, all valid (√): S+I+, EI*, I*, MI*, OS*I*
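Whether a given global state is a conflict can be checked mechanically. The following is a minimal sketch, assuming the usual MOESI invariants: at most one M or E copy, no other valid copy alongside an M or E copy, and at most one O copy. These invariants and the function name are assumptions for illustration. The slides themselves only give examples, such as OS+I+ being valid and MS+I+ being a conflict.

```python
from collections import Counter

def is_conflict(cache_states):
    """True if the per-cache states of one block form a conflicting global state.

    cache_states: iterable of single-letter states, one per cache,
    drawn from M, O, E, S, I.
    """
    counts = Counter(cache_states)
    unknown = set(counts) - set("MOESI")
    if unknown:
        raise ValueError(f"unknown states: {sorted(unknown)}")
    exclusive_copies = counts["M"] + counts["E"]
    other_valid = counts["O"] + counts["S"]
    if exclusive_copies > 1:
        return True        # two caches both believe they hold the only copy
    if exclusive_copies == 1 and other_valid > 0:
        return True        # an M or E copy coexists with another valid copy
    if counts["O"] > 1:
        return True        # more than one owner
    return False

print(is_conflict(["O", "S", "I"]))  # False
print(is_conflict(["M", "S", "I"]))  # True
```

This checks a single state only. Applying the theorem would also require enumerating every global state reachable through the transition machine and running such a check on each one.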
MOESI + ESI
[Figure: global state transition machine for MOESI + ESI; states and transitions carry C and D subscripts, with transitions labeled by reads and writes such as RC|E, WD|I and RD|SI.]
6 global states, all valid (√): S+I+, ECI*, I*, MCI*, EDI*, OCS*I*
DMA Cache Design Issues
• Write Policy
• Cache Coherence
• Replacement Policy: an LRU-like replacement policy
  1. Invalid
  2. Shared
  3. Owned
  4. Exclusive
  5. Modified
• Prefetching
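One plausible reading of the numbered list is a victim-selection order: prefer an Invalid line, then Shared, Owned, Exclusive, and Modified last, with LRU order breaking ties within a state. The following is a minimal sketch under that assumption. The tuple layout and function name are hypothetical, and the slide does not confirm how ties are broken.

```python
# Lower number = evicted first (assumed reading of the list on the slide).
STATE_PRIORITY = {"I": 0, "S": 1, "O": 2, "E": 3, "M": 4}

def choose_victim(cache_set):
    """Pick the line to evict from one cache set.

    cache_set: list of (tag, state, last_used) tuples, where a smaller
    last_used value means the line was touched longer ago.
    """
    tag, _, _ = min(cache_set, key=lambda line: (STATE_PRIORITY[line[1]], line[2]))
    return tag

lines = [("a", "M", 1), ("b", "S", 5), ("c", "S", 3), ("d", "E", 0)]
print(choose_victim(lines))                    # c
print(choose_victim(lines + [("e", "I", 9)]))  # e
```

Evicting clean or shared lines before Modified ones avoids write-backs where possible, which may be the motivation for such an ordering.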
DMA Cache Design Issues
• Write Policy
• Cache Coherence
• Replacement Policy
• Prefetching
  • Adopt straightforward sequential prefetching
  • Prefetching is triggered by a cache miss
  • Fetch 4 blocks at a time
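Miss-triggered sequential prefetching can be sketched in a few lines. This is a minimal sketch, assuming that "4 blocks" means the missing block plus the next three consecutive blocks, and that the cache never evicts. Both are simplifications, and the class name is illustrative.

```python
PREFETCH_BLOCKS = 4  # blocks brought in per miss (assumed to include the missing one)

class PrefetchingCache:
    """Toy cache with unlimited capacity and miss-triggered sequential prefetch."""

    def __init__(self):
        self.blocks = set()

    def access(self, block: int) -> bool:
        """Return True on a hit; on a miss, fetch the block and its successors."""
        if block in self.blocks:
            return True
        for offset in range(PREFETCH_BLOCKS):
            self.blocks.add(block + offset)
        return False

cache = PrefetchingCache()
results = [cache.access(b) for b in range(16)]  # a purely sequential stream
print(results.count(False), "misses,", results.count(True), "hits")  # 4 misses, 12 hits
```

On a purely sequential stream, three of every four references hit on prefetched blocks, which illustrates why a regular access pattern suits this simple scheme.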
Design Complexity vs. Design Cost
• Dedicated DMA Cache (DDC)
• Partition-Based DMA Cache (PBDC)
Outline
Revisiting I/O
DMA Cache Design
Evaluations
Conclusions
Speedup of Dedicated DMA Cache
% of Valid Prefetched Blocks
• DMA caches exhibit impressively high prefetching accuracy.
• This is because I/O data has a very regular access pattern.
Performance Comparisons
Although PBDC does not require additional on-chip storage, it can achieve about 80% of DDC’s performance improvements.
Outline
Revisiting I/O
DMA Cache Design
Evaluations
Conclusions
Conclusions
• We have proposed a DMA cache technique to separate I/O data and CPU data.
• We adopt a Global State Method for integrating heterogeneous cache protocols.
• Experimental results show that DMA cache schemes are better than the existing approaches that use unified, shared caches for I/O data and CPU data.
• Still-open problems, e.g.,
  • Can I/O data go directly to the L1 cache?
  • How to design heterogeneous caches for different types of data?
  • How to optimize the MC with awareness of I/O?
Thanks!
Questions?
RTL Emulation Platform
• LLC and DMA cache model from Loongson-2F
• DDR2 memory controller from Loongson-2F
• DDR2 DIMM model from Micron Technology
[Figure: Memory trace, LL Cache, DMA Cache, MemCtrl, DDR2 DIMM]
Parameters
DDR2-666
Normalized Speedup for WB
• The baseline is the snoop cache scheme.
• DMA cache schemes exhibit better performance than the others.
DMA Write & CPU Read Hit Rate
• Both the shared cache and the DMA cache exhibit high hit rates.
• Then where do the cycles go in the shared-cache scheme?
Breakdown of Normalized Total Cycles
Design Complexity of PBDC
More References on Cache Coherence Protocol Verification
Fong Pong, Michel Dubois, Formal verification of complex coherence protocols using symbolic state models, Journal of the ACM (JACM), v.45 n.4, p.557-587, July 1998
Fong Pong, Michel Dubois, Verification techniques for cache coherence protocols, ACM Computing Surveys (CSUR), v.29 n.1, p.82-126, March 1997