Copyright IBM Corp. 2009. All rights reserved. ibm.com/redbooks 1
Redpaper
Explaining DS5000 Performance
Introduction
This IBM Redpapers publication documents measurements of DS5000 performance. The measurements were done using a DS5000 and peripheral server equipment in the lab in Gaithersburg, Maryland. This document was created by Siebo Friesenborg of Storage ATS for the Americas Group. Direct questions to Siebo Friesenborg by e-mail at [email protected] or by phone at 522-799-5894.
The purpose of the project is to understand the basic performance of the DS5000 and document it.
We believe that the most important way to disseminate disk measurements is through Disk Magic. Almost all of the measurements taken during this project were for a single type of I/O operation. For example, the first measurement in this paper is for read misses. Read misses are read operations that are not resolved in the cache. We have never seen a production workload that is entirely read misses. The measurement is taken to allow extrapolation to situations where read misses are less than 100% of the entire workload. Readers might extrapolate by thinking that if you find 20% read hits, you can run more work.
Think of Disk Magic as an extrapolator. It takes the base measurements in this paper (in addition to others specifically taken for Disk Magic) and incorporates them in a model that allows you to estimate performance for workloads that were not measured. Disk Magic is a tool made available to IBMers and IBM Business Partners.
Equipment
The configuration used naturally included as much disk, fiber, and control unit performance as was possible at the time the measurements were taken. You cannot learn what the maximum performance is unless you have the fastest configuration allowed. The DS5300 configuration had 16 GBytes of cache storage, 16 Fibre Channel (FC) channels capable of 4 GBit per second each, and 256 hard disk drives (HDDs) attached to 16 EXP810 expansion drawers connected through a 4 GBit fiber. The disks were Fibre Channel and spun at 15K RPM. No Serial Advanced Technology Attachment (SATA) was measured in this project.
Alex Osuna
Siebo Friesenborg
Connections between the DS5300 and the 16 host channels running at 4 GBit each were made through Cisco switches according to the best practices information in IBM Redbooks publication DS5000 Disk Storage Subsystem Architecture, Implementation, and Usage, SG24-7676-00. Eleven of the 32 engines available on the P595 were actually used.
Finally, two arrays, each eight drives, were allocated on each drawer. The content of the array depended on the RAID used, as seen in Table 1.
Table 1 Array Content
There is a best practice to allocate volumes so that only one HDD per array exists on any EXP810 or EXP5000 drawer. There is no difference in the performance compared to allocating all drives of an array in the same drawer. Since this was a performance project and it would involve generating measurements for 2, 4, 6, 12, 14, 16 drawers, the best practice was put aside.
Measurement results
In this section we review some details about the measurements.
An I/O driver is a program that is only interested in reading and writing data as was described in some sort of input language. Since it is not concerned with data relations or data reconstruction, an I/O driver can probably put a lot more load on a DS5300 than a Database Management System (DBMS) for a given amount of server utilization. There is also no application logic (adding to yearly, W2 deductions, decreasing inventory, checking customer balances, and so on), so it can probably cause more load than applications. It does not do any checking for valid data or processing that is dependent on input data, so using an I/O driver is much easier to manage than real applications. An I/O driver offers great benefits for doing a measurement process.
However, we must understand that the measurements produced are almost certainly higher than what could be expected in a production environment. As an example, the 16-drawer measurement of read misses had 32 hdisks each running at 98.8% busy with I/O rates between 2105.9 and 2121.8 reads per second. It is highly improbable that a production workload could ever cause such well-balanced measurements.
The I/O driver used for these measurements is named Performance Assessment Workload Suite (PAWS). It is a suite of programs that IBM Storage Performance in Tucson has used for many years. With PAWS, if a user encounters a task that is very difficult to do, PAWS can be changed so that the user no longer has to do it. This is very effective for the lab measurement environment.
One of the salient features of PAWS is that it manages the hit percentages. If your specification is to have a 60% hit ratio, PAWS monitors the number of hits and misses and changes reference patterns so that you get what was specified. If the read hits are at 50%, PAWS will read data from a smaller number of locations so that the hit percentage goes up. If the hits are at 70% it will read from a larger set of locations.
Table 1 Array Content
RAID             Content
RAID-1 4x4       Four mirrored pairs
RAID-5 7D+P      Seven data drives plus a drive's worth of parity
RAID-6 6D+P+Q    Six data drives plus two drives' worth of parity
This is very useful, but it also means that you cannot determine what hit percentage you will get if you change the amount of cache storage available. Regardless of the amount of cache storage that you install, the hit percentage that you receive will be what you specified to PAWS.
Finally, these numbers represent the best measurements that we can get. This paper notes options that you may not choose for non-performance reasons. For instance, cache mirroring has a large impact on large block sequential write performance. Performance with and without cache mirroring enabled is documented for sequential writes.
Read miss operations
The chart in Figure 1 shows the response time.
Figure 1 DS5000 read miss operations versus HDDs
After the DS5300 went to customer benchmark activity, we found that the 2-drawer measurements were invalid.
The improvement from 14 to 16 drawers is about the same as what we received from adding two drawers to the mix at other levels. This indicates that the critical resource is HDDs and we should be able to improve performance if we use more HDDs (or faster HDDs).
Sixteen drawers of 15K RPM drives spin a total of 64,000 rotations a second. Compare that to the 58,000 operations per second measured with 20-millisecond response times. The seek optimization algorithm is working. The chart is cut off at a 20-millisecond response time. Another run did 69,000 read operations per second at 59.73 milliseconds per operation. That is 1.08 read operations per rotation.
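The rotation arithmetic above can be checked with a short sketch. The function names are illustrative only and are not part of PAWS or Disk Magic; the drive count (16 drawers of 16 HDDs) and the rates are the ones quoted in the text.

```python
def rotations_per_second(drives: int, rpm: int) -> float:
    """Total platter rotations per second, summed over all drives."""
    return drives * rpm / 60


def ops_per_rotation(ops_per_second: float, drives: int, rpm: int) -> float:
    """Operations completed per platter rotation across the whole configuration."""
    return ops_per_second / rotations_per_second(drives, rpm)


if __name__ == "__main__":
    drives, rpm = 256, 15_000  # 16 drawers x 16 HDDs at 15K RPM
    print(f"rotations per second: {rotations_per_second(drives, rpm):,.0f}")
    for rate in (58_000, 69_000):
        ratio = ops_per_rotation(rate, drives, rpm)
        print(f"{rate:,} ops/second is {ratio:.2f} operations per rotation")
```

A ratio above 1.0 is only possible when requests are reordered so that more than one is served per rotation on average, which is consistent with the remark that the seek optimization algorithm is working.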
[Figure 1 chart: milliseconds per I/O (0 to 20) versus operations per second (0 to 60,000); series: 4, 6, 8, 10, 12, 14, and 16 drawers, plus 16xDS4800.]
The dotted line shows the performance obtained (in 2005) from the DS4800. It looks like there are 46,000 reads on the DS4800 and 58,000 operations on the DS5000 (about a 20% increase). That is an incredible improvement given that the devices on the DS4800 rotate and seek at the same speed as the devices on the DS5300.
Figure 2 shows a graph of what happened in the 16-drawer runs. Note that:
- There is not much difference in the performance of RAID-1, RAID-5, and RAID-6 doing random read miss operations. After you look at this data, you may be inclined to wonder why random read times would change due to RAID.
- The dotted line with square markers shows the DS4800 measurements. The dotted line with circle markers shows what you get if you multiply the DS4800 read rate by 1.14 (256 HDDs divided by 224 HDDs). This is an estimate of what a 256-drive DS4800 would do. There are other factors that improve random read times on the DS5300.
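The drive-count scaling used for the "DS4800 256" line can be sketched as follows. This is a minimal illustration that assumes throughput scales linearly with the number of HDDs, which is the approximation the text describes; the function name is hypothetical, and the 46,000 figure is the DS4800 read rate read off the chart earlier in this paper.

```python
def scale_by_drive_count(rate: float, measured_drives: int, target_drives: int) -> float:
    """Linear estimate of the rate a configuration would reach with more drives."""
    return rate * target_drives / measured_drives


if __name__ == "__main__":
    factor = scale_by_drive_count(1.0, measured_drives=224, target_drives=256)
    print(f"scaling factor: {factor:.2f}")
    estimate = scale_by_drive_count(46_000, measured_drives=224, target_drives=256)
    print(f"estimated 256-drive DS4800 read rate: {estimate:,.0f} operations/second")
```

The estimate ignores controller and fabric limits, so it is best read as an upper bound on what more drives alone would deliver.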
Figure 2 Sixteen-drawer DS5000 read miss operations
Read miss observations
Figure 2 shows the improvement that the DS5300 made by allowing the attachment of more HDDs. The DS4800 measurements documented in 2005 were done using EXP700 expansion units. This limited the transfer rate to 2, rather than 4, GBit/second. Dividing the 4 K block size by 200 MBytes per second gives a transfer time of about 0.02 milliseconds per block, which bounds the improvement available from the faster link.
The DS5300 control units are faster than the DS4800. Therefore, the time initiating and completing requests is reduced.
HDD vendors are constantly improving their products. This improves all of the products that use that HDD.
[Figure 2 chart: milliseconds per I/O (0 to 20) versus operations per second (0 to 60,000); series: RAID-6, RAID-5, RAID-1, DS4800 R5, and DS4800 256. Note: DS4800 256 is an approximation of what a 256 HDD DS4800 would do.]
More importantly, the addition of drives to the DS5300 causes read miss performance improvements from the minimum through maximum number of HDDs allowed. There is headroom in the DS5300 that could be used to accommodate more or faster HDDs.
Write miss operations
These figures may seem confusing. What is shown in Figure 3 is that write miss response times are very low regardless of rate until cache write storage is filled. Then all of the activity goes to HDDs, where a number of milliseconds are required rather than the microseconds required to transfer between storage on the server and the DS5300. Any additional workload results in higher response time with no additional work being done. Note that the amount of cache write storage available does not make a lot of difference. Once the write miss rate exceeds the capability of the HDDs, you will run out of cache write storage. The only question is how long it will take.
Figure 3 DS5000 RAID-5 write miss operations versus HDDs
Again, we see that adding HDDs to the configuration results in performance increases throughout the range of supported configurations. There is an ability to accommodate more or faster HDDs.
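The remark that "the only question is how long it will take" can be illustrated with a simple fill-time model. This is a sketch under stated assumptions, not a description of the DS5300 cache algorithm: it assumes a constant arrival rate, a constant destage rate, and a fixed block size. The function name and every number in the usage lines are hypothetical.

```python
import math


def seconds_until_cache_full(cache_bytes: int,
                             arrival_ops: float,
                             destage_ops: float,
                             block_bytes: int = 4096) -> float:
    """Time for write cache to fill when writes arrive faster than they destage.

    Returns math.inf when the HDDs keep up (arrival_ops <= destage_ops).
    """
    surplus_ops = arrival_ops - destage_ops
    if surplus_ops <= 0:
        return math.inf
    return cache_bytes / (surplus_ops * block_bytes)


if __name__ == "__main__":
    # Hypothetical values: 4 GiB of write cache, HDDs able to destage
    # 14,000 write misses per second, and a request rate of 16,000 per second.
    small = seconds_until_cache_full(4 * 1024**3, 16_000, 14_000)
    large = seconds_until_cache_full(8 * 1024**3, 16_000, 14_000)
    print(f"4 GiB of write cache fills in {small:.0f} seconds")
    print(f"8 GiB of write cache fills in {large:.0f} seconds")
```

Under this model, doubling the cache only doubles the time before response times climb. The sustainable rate is still set by the HDDs, which matches the observation that the amount of cache write storage does not make a lot of difference.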
[Figure 3 chart: milliseconds per I/O (0 to 20) versus operations per second (0 to 15,000); series: 2, 4, 6, 8, 10, 12, 14, and 16 drawers, plus 16xDS4800.]
Sixteen-drawer write miss operations
We see in Figure 4 that the DS5300 has more of an advantage over the DS4800 when running random write misses than the advantage measured for read miss operations. This is because of the faster, dedicated fabric used for mirroring write data in cache. On DS4800, the data was mirrored by using the same back-end fabric as was used to send data to the HDDs. On the DS5300, the fabric is dedicated to controller communication and the speed of the fabric is 17 Gigabytes (GBytes) per second.
Figure 4 DS5000 16 drawer write miss operations
While the DS5300 could do 58,000 instead of 48,000 read misses per second (1.21 times the DS4800), it can do 14,000 rather than 9,000 write miss operations (1.56 times the DS4800). A higher write content gives the DS5300 a larger performance advantage.
Write miss observations
Figure 3 shows that there is performance improvement all through the range of HDDs that can be attached. The DS5300 is faster doing writes than the DS4800 for the same reasons that it was faster doing reads:
- More devices
- Faster control units
- Faster HDDs
- Faster fabric
In addition, there is faster mirroring because of the faster PCI-extended bus, which is dedicated to the cache mirroring task.
It is important to understand random write misses. Different RAID and HDD hardware does not really make any difference in performance until cache write storage cannot be freed quickly enough to match the request rate. Then it takes milliseconds rather than microseconds, and any increase in request rate goes straight into queue time.
[Figure 4 chart: milliseconds per I/O (0 to 20) versus operations per second (0 to 25,000); series: RAID-6, RAID-5, and DS4800 R5.]
Performance is equivalent until about 9,000 write miss operations per second. Then the DS4800 cannot destage data quickly enough. At about 10,000 write miss operations, the RAID-6 availability benefit finally impacts performance. RAID-5 can take us up to nearly 15,000 writes per second. This agrees fairly well with the idea that it takes six HDD operations to do a RAID-6 miss and four to implement a RAID-5 miss.
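The agreement between the two limits can be seen by converting each write miss rate into back-end HDD operations. The sketch below uses the four and six HDD operations per miss stated above; the dictionary and function names are illustrative only.

```python
# HDD operations needed to complete one random write miss, as stated in the text.
HDD_OPS_PER_WRITE_MISS = {"RAID-5": 4, "RAID-6": 6}


def backend_ops(write_misses_per_second: float, raid: str) -> float:
    """HDD operations per second generated by a given write miss rate."""
    return write_misses_per_second * HDD_OPS_PER_WRITE_MISS[raid]


def max_write_miss_rate(hdd_ops_budget: float, raid: str) -> float:
    """Write miss rate that a fixed HDD operation budget can sustain."""
    return hdd_ops_budget / HDD_OPS_PER_WRITE_MISS[raid]


if __name__ == "__main__":
    print(backend_ops(10_000, "RAID-6"))  # RAID-6 limit seen in the chart
    print(backend_ops(15_000, "RAID-5"))  # RAID-5 limit seen in the chart
```

Both limits correspond to the same back-end load of roughly 60,000 HDD operations per second, which suggests that the drives, not the RAID logic, set the ceiling.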
DS5000 read or write hit operations
This section covers cache hit operations (see Figure 5).
Figure 5 DS5000 read/write hit workloads
Read hit measurement
So far, we have discussed operations in which all data is transferred to or from the HDDs. Lab measurements were also made where operations caused no activity on the HDDs. The read hit measurements for 512-byte and 4096-byte records show N/A (not applicable) for the I/O per second and Mbytes per second columns. This is because the (IOSTAT) measurement of CPU utilization of the system driving the I/O requests showed 100%. Remember that PAWS is designed to cause I/O operations with minimal use of the server.
The larger block sizes (64 K and 512 K) resulted in transfer rates near the theoretical limits of the server channels (see Figure 5); that is, 4 GBit channels run at up to 400 MBytes per second. Sixteen channels should be able to do 6,400 Mbytes per second. Getting values like 6,271 or 6,461 against a theoretical limit of 6,400 suggests that a DS5300 should be able to get more performance out of faster (or more) channels.
Write hit measurement
The write hit measurements of 64 K and 512 K transfers show that write operations can run near the rated speed of the fabric between the server and DS5300 (see Figure 5). If we find less performance for sequential operations, it will be because of the back-end fabric. The circuitry for mirroring across the two controllers of the DS5300 can do over 90 K operations a second (or about 11 microseconds per transfer).
Read hit measurement (CPU% is from the IOSTAT measurement of the 11-processor P595)
Bytes/read    IO/second    MB/second    CPU%
512           N/A          N/A          100.0
4096          N/A          N/A          100.0
64K           95,692       6,271        50.5
512K          12,323       6,461        32.3
Write hit measurement
Bytes/write   IO/second    MB/second
4096          90,482       371
64K           75,394       4,954
512K          12,060       6,323
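The relationship between the I/O rate and the transfer rate in the Figure 5 tables can be checked with a few lines. This sketch assumes that K means 1,024 bytes and that one MByte is 1,000,000 bytes, which reproduces most of the table entries; the function names are illustrative.

```python
K = 1024  # assumption: block sizes are stated in units of 1,024 bytes


def mb_per_second(io_per_second: float, bytes_per_io: int) -> float:
    """Transfer rate in MBytes (10**6 bytes) per second."""
    return io_per_second * bytes_per_io / 1_000_000


def microseconds_per_io(io_per_second: float) -> float:
    """Average time per operation when operations are counted back to back."""
    return 1_000_000 / io_per_second


def percent_of_channel_limit(mb_rate: float, channels: int = 16,
                             mb_per_channel: float = 400) -> float:
    """Measured rate as a percentage of the nominal channel limit."""
    return 100 * mb_rate / (channels * mb_per_channel)


if __name__ == "__main__":
    print(f"512 K read hits:  {mb_per_second(12_323, 512 * K):,.0f} MB/second")
    print(f"64 K read hits:   {mb_per_second(95_692, 64 * K):,.0f} MB/second")
    print(f"4096 write hits:  {mb_per_second(90_482, 4096):,.0f} MB/second")
    print(f"per 4096 write:   {microseconds_per_io(90_482):.2f} microseconds")
    print(f"6,461 MB/second:  {percent_of_channel_limit(6_461):.1f}% of 6,400")
```

Under the same assumptions, the 64 K write hit row works out to about 4,941 MB/second rather than the 4,954 shown in the table, a difference of less than 1%.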
Summary of cache hit operations
One hundred percent cache read or write hits are not probable. If you hear performance numbers that sound too good to be true, it is probably an accurate measurement of a product running at 100% hits. For people using PAWS, it is easy to get 100% cache hits. Just specify it as a parameter. People running production work have a much harder time. This may be because the server configuration could not generate enough I/O requests or because a realistic reference pattern did not fit in the cache storage available.
On Line Transaction Processing (OLTP) workload
All of the measurements so far are valuable measurements (especially for Disk Magic) because it is relatively easy to describe the environment and explain the results. They also quantify maximum possible benefits and liabilities of making a configuration change.
A traditional lab measurement in Tucson is the OLTP workload. It consists of 70% reads, 50% read hits, and 33% write hits (you might hear it referred to as the 70-30-50 workload). It is intended for those people who ask how the xxx performs in a normal workload, so we discuss OLTP numbers. This information as to how well different operations are combined is invaluable to the Disk Magic people.
If you do 100 OLTP requests, 70 will be reads: 35 resolved in cache storage, 35 read from HDDs. Thirty requests will be writes. Ten of the writes will replace data that was written previously but that has not yet been written to HDDs. The data from the previous request will be replaced and never cause HDD activity. Twenty operations will go to the HDDs. The amount of HDD activity required will vary depending on the RAID used, as shown in Table 2.
Table 2 HDD operations
The 20 RAID-1 write misses generate 40 HDD writes. Add that to the 35 read miss operations and we find that 100 OLTP operations result in 75 operations on the HDDs, where 53% of those were caused by write misses. RAID-5 and RAID-6 require more operations. The net is that write operations cause 70% and 77% of the total HDD activity for RAID-5 and RAID-6, respectively.
Read/write              OPS   Hits   Misses   RAID-1   RAID-5   RAID-6
Read                     70     35       35       35       35       35
Write                    30     10       20       40       80      120
Total                   100     45       55       75      115      155
HDD% caused by writes                            53%      70%      77%
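The Table 2 arithmetic can be reproduced with a short sketch. The per-miss HDD operation counts (2, 4, and 6) are the ones implied by the table, where 20 write misses become 40, 80, and 120 HDD operations; the names used here are illustrative.

```python
# HDD operations per write miss implied by Table 2 (20 misses -> 40, 80, 120).
HDD_OPS_PER_WRITE_MISS = {"RAID-1": 2, "RAID-5": 4, "RAID-6": 6}


def oltp_hdd_ops(raid: str, reads: int = 70, read_hits: int = 35,
                 writes: int = 30, write_hits: int = 10):
    """Return (total HDD operations, fraction caused by writes) per 100 requests."""
    read_misses = reads - read_hits        # each read miss is one HDD read
    write_misses = writes - write_hits     # write hits never reach the HDDs
    write_ops = write_misses * HDD_OPS_PER_WRITE_MISS[raid]
    total = read_misses + write_ops
    return total, write_ops / total


if __name__ == "__main__":
    for raid in HDD_OPS_PER_WRITE_MISS:
        total, write_share = oltp_hdd_ops(raid)
        print(f"{raid}: {total} HDD operations, {write_share:.0%} caused by writes")
```

Changing the hit counts in the call shows how quickly the write share grows as write hits fall, which is why this workload behaves so much like the write miss measurements at high rates.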
In Figure 6 the near-vertical finish looks a lot like the 100% write miss workload. As explained in Table 2, writes dominate the workload. The lower I/O rates look more like read misses with contention causing response time increases. The read and write hits support a higher I/O rate than was possible on the 100% miss measurements.
Figure 6 DS5000 OLTP (70-30-50) by drawers
The DS4800 performance was less than what 10 drawers of DS5300 could generate. We should be aggressive replacing DS4800 when there is a lot of write activity.
Most importantly, adding HDDs improves performance throughout the range of valid configurations. There is headroom to incorporate more or faster HDDs.
[Figure 6 chart: milliseconds per I/O (0 to 20) versus operations per second (0 to 80,000); series: 2, 4, 6, 10, 12, 14, and 16 drawers, plus DS4800 R5.]
OLTP considerations
Figure 7 gives us the following observations:
- RAID-6 on the DS5300 runs about the same as RAID-5 on the DS4800.
- RAID-5 on the DS5300 is about 1/3 faster than DS5300 RAID-6 or RAID-5 on the DS4800.
- Random writes dominate this workload. RAID-1 performance is about the same (5 to 6 milliseconds per request) as DS5300/RAID-5 up to 42,000 operations per second, then the reduced contention on the HDDs takes effect.
Figure 7 DS5000 OLTP (70-30-50) for 16 drawers
Sequential performance
Most of the measurements shown so far are concerned with random processing requests with a relatively small number of bytes per operation (4 K). A 4 GBit/second channel transfers 4,096 bytes of data in 0.01 milliseconds. Comparing that to 3.7 milliseconds for an average seek or 2 milliseconds for half a rotation on a 15 K RPM disk, we should expect that channels are not a large factor. The maximum rate of the OLTP workload was 73,000 operations per second. At 4 K per block, that is 292,000 KBytes/second or 292 Mbytes/second. A single 4 GBit channel can probably handle such a small load.
Large block sequential processing is mostly about the speed of the fabric between the storage subsystem and either the server or the disks.
Sixty-four K transfers should take 0.16 milliseconds and 512 K transfers should take 1.28 milliseconds. What is more, we are probably not seeking great distances and the channels are more effective because initiating and ending requests are less frequent events. Large block sequential processing is very different from the random operations discussed so far.
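The transfer times quoted above follow from the 400 MBytes per second that this paper uses for a 4 GBit channel. The sketch below is a minimal check of that arithmetic; the constant and function names are illustrative, and K is assumed to mean 1,024 bytes.

```python
CHANNEL_BYTES_PER_SECOND = 400_000_000  # 4 GBit channel at 400 MBytes/second


def transfer_ms(nbytes: int, bytes_per_second: int = CHANNEL_BYTES_PER_SECOND) -> float:
    """Milliseconds needed to move nbytes over one channel."""
    return nbytes / bytes_per_second * 1000


def half_rotation_ms(rpm: int) -> float:
    """Average rotational delay: half of one full rotation, in milliseconds."""
    return 60_000 / rpm / 2


if __name__ == "__main__":
    for label, nbytes in (("4 K", 4 * 1024), ("64 K", 64 * 1024), ("512 K", 512 * 1024)):
        print(f"{label:>5} transfer: {transfer_ms(nbytes):.2f} milliseconds")
    print(f"half rotation at 15K RPM: {half_rotation_ms(15_000):.1f} milliseconds")
    print(f"OLTP peak: {73_000 * 4:,} KBytes/second")
```

With K taken as 1,024 bytes, the 512 K transfer comes out at about 1.31 milliseconds. The 1.28 figure in the text corresponds to 128 transfers of 0.01 milliseconds each.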
[Figure 7 chart: milliseconds per I/O (0 to 20) versus operations per second (0 to 80,000); series: RAID-6, RAID-5, DS4800 R5, RAID-1, and DS4800 R1.]
Configuration
In all the measurements taken, an array consisted of eight HDDs (spread over two drawers). The content of the drawers is shown in Table 3.
Table 3 Contents of drawers
Only two of the four arrays in a pair of drawers were used for the sequential runs. Three runs were made using 2, 4, 6, 12, 14, 16 drawers. The types of runs are discussed below.
Sequential reads
This is the type of access that is found in applications handling ad hoc database requests (that is, requests that do full table scans because there is no indexing available). Another application for sequential reads would be viewing videos as a service. Video editing would use a combination of sequential reads and writes.
Sequential writes
Reading backup, archive, or surveillance data indicates data recovery, an audit, or a crime, respectively. One hopes that those applications are dominated by sequential writes.
Cache mirror disabled
Cache mirroring was found to be a very large performance factor on DS4000 versions. Actually, it was a large factor on the FAStT products that preceded DS4000. Basically, sequential writes could run two or three times as fast if cache mirroring was disabled. However, cache became a single point of failure if mirroring was disabled.
Basically, we were trying to see whether we could get 400 Mbytes per second per channel using eight drives. We wanted to end up with 6,400 Mbytes per second out of 16 channels and 128 drives.
RAID     Configuration   Comments
RAID-1   4x4             Four mirrored pairs. Some people call this RAID-10.
RAID-5   7D+P            Seven drives' worth of data plus a drive's worth of parity.
RAID-6   6D+P+Q          Six drives' worth of data plus two drives' worth of parity.
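The sequential target of 400 Mbytes per second per channel from eight drives implies a per-drive streaming rate, which the following sketch works out. It treats all eight drives of an array as sharing the channel load equally, which is a simplifying assumption that ignores parity overhead; the function names are illustrative.

```python
def drives_per_channel(drives: int = 128, channels: int = 16) -> float:
    """Number of drives feeding each host channel in the sequential runs."""
    return drives / channels


def per_drive_mb_per_second(channel_mb: float = 400, drives: float = 8) -> float:
    """Streaming rate each drive must sustain to fill one channel."""
    return channel_mb / drives


if __name__ == "__main__":
    share = drives_per_channel()
    print(f"drives per channel: {share:.0f}")
    print(f"required rate per drive: {per_drive_mb_per_second(400, share):.0f} MB/second")
```

If each drive can stream at that rate or better, the channels become the limit, which is what the sequential read results later in this paper suggest.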
DS5000 sequential read operations
The top chart in Figure 8 shows results when the transfer was 64 KBytes per operation. The array segment size was 512 Kbytes for all of the measurements described in this paper. So with a 64 K block size, it takes eight reads to read an entire segment. The bottom chart in Figure 8 shows the (slightly better) measurements achieved when a single read operation asked for an entire array segment. The biggest number is 6,423 MBytes per second. We probably cannot get more than that with twice the drives.
Figure 8 DS5000 sequential reads
[Figure 8 charts: MB per second (0 to 7,000) versus number of drawers (2 to 16). Top panel, 64K Blocksize: R1, R5, R6, and DS4800. Bottom panel, 512K Blocksize: R1, R5, and R6.]
Sequential write operations
In Figure 9 we show the measurements at 64 and 512 Kilobytes per operation. Writing data with cache mirroring enabled is significantly slower than reading. For example, RAID-5 reads 512 K blocks at 6423 MByte/second and writes at 3760 MByte/second. Table 4 shows the inputs to Figure 9.
Figure 9 DS5000 sequential writes
Table 4 Inputs to Figure 9
The DS4800 had 2 GBit fabric and half as many channels as the DS5000. Also, the data bus on the DS4800 was only capable of a nominal 1,600 MByte per second. It had pretty good performance in 2005.
RAID-1 is an excellent way to write small random blocks. However, it has to write much more sequential data than RAID-5 or RAID-6 arrays doing stride writes. While RAID-1 is an excellent best practice for some uses, it is not a best practice for sequential performance.
(units are Mbytes per second)
         64 K              512 K
RAID     Read    Write     Read    Write
RAID-1   2964    1764      2703    1918
RAID-5   5812    3459      6423    3760
RAID-6   4850    3106      6135    3376
DS4800   1400    358
[Figure 9 charts: MB per second (0 to 6,000) versus number of drawers (2 to 16). Top panel, 64K Blocksize: RAID-1, RAID-5, RAID-6, and DS4800. Bottom panel, 512K Blocksize: RAID-1, RAID-5, and RAID-6.]
Figure 10 shows that if we disable cache mirroring, the activity looks a lot more like sequential
reads in terms of transfer rates.
Figure 10 DS5000 sequential writes with cache mirroring disabled
For the 64 K block size run, the numbers are shown in Table 5.
Table 5 64 K block size run
The installation that does a massive amount of sequential writes must consider the trade-off between running RAID-5 1.68 (5812/3459) times as fast and creating a single point of failure. Those installations saving surveillance data or video images will be inconvenienced and have completeness problems (like gaps on tapes). It will not be the same as making invalid stock market quotes and having to make up any losses. Some installations might have schemes that can use a disaster recovery site to make the data good. For those installations doing massive large block sequential operations, it is still a point of interest. This is not nearly the burning question that faced DS4800 installations, where the cache mirroring disabled runs are 2.96 times as fast. Dedicated fabric and a much faster bus for write operations make the impact much less on the DS5000.
(units are Mbytes per second)
RAID     Enabled   Disabled
RAID-1   1764      2964
RAID-5   3459      5812
RAID-6   3106      4850
DS4800   358       1062
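The speedups quoted in the text can be derived directly from the Table 5 values. The sketch below simply divides the disabled rate by the enabled rate for each row; the dictionary and function names are illustrative.

```python
# Table 5: 64 K sequential writes in Mbytes per second,
# as (cache mirroring enabled, cache mirroring disabled).
TABLE_5 = {
    "RAID-1": (1764, 2964),
    "RAID-5": (3459, 5812),
    "RAID-6": (3106, 4850),
    "DS4800": (358, 1062),
}


def mirroring_disabled_speedup(name: str) -> float:
    """How many times faster the run is with cache mirroring disabled."""
    enabled, disabled = TABLE_5[name]
    return disabled / enabled


if __name__ == "__main__":
    for name in TABLE_5:
        print(f"{name}: {mirroring_disabled_speedup(name):.3f} times as fast")
```

The DS4800 ratio comes out just under 2.97, which the text quotes as 2.96. All three DS5000 rows stay below 1.7.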
[Figure 10 charts: MB per second (0 to 7,000) versus number of drawers (2 to 16). Top panel, 64K Blocksize: RAID-1, RAID-5, RAID-6, and DS4800. Bottom panel, 512K Blocksize: RAID-1, RAID-5, and RAID-6.]
Figure 11 shows us that a decision about cache mirroring is not nearly so hard for random write misses. We are transferring 4 K blocks. The performance penalty due to twice as many 0.01 millisecond transfers is hard to measure in Figure 11. If it is hard to measure the difference, choose the availability that cache write mirroring provides. This is a very different trade-off from large block sequential.
Figure 11 DS5000 RAID-5/6 enable/disable cache mirroring
Sequential summary
If we are to get any better sequential processing, we need more channels or faster channels. We do not expect normal installations to put in the effort required to reach system maximums as shown here, but some installations are so dominated by sequential processing that it is possible (and financially worthwhile) to schedule workload and fully use equipment.
Those installations doing more general data transfer will occasionally notice that a backup or archive run went significantly faster on the DS5000 than it did on the predecessor.
RAID-5 is slightly faster than RAID-6 because less parity is written for the same amount of data. In the configuration used in the lab, seven out of eight array segments in a stride contained RAID-5 data. For RAID-6, six out of eight segments in a stride contain data. Naturally, the more devices per stride, the less the RAID-5 advantage (ending up with 28 versus 29 data segments per 30-HDD stride).
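The shrinking RAID-5 advantage can be made concrete by counting data segments per stride. This sketch assumes one parity segment per stride for RAID-5 and two for RAID-6, as described above, and expresses the advantage as the ratio of data segments; the function names are illustrative.

```python
from fractions import Fraction


def data_segments(hdds_per_stride: int, parity_segments: int) -> int:
    """Segments in a stride that hold data rather than parity."""
    return hdds_per_stride - parity_segments


def raid5_advantage(hdds_per_stride: int) -> Fraction:
    """Ratio of RAID-5 to RAID-6 data segments for the same stride width."""
    return Fraction(data_segments(hdds_per_stride, 1),
                    data_segments(hdds_per_stride, 2))


if __name__ == "__main__":
    for width in (8, 30):
        ratio = raid5_advantage(width)
        print(f"{width} HDDs per stride: {ratio} ({float(ratio):.3f})")
```

At eight drives per stride the ratio is 7/6, and at 30 drives it falls to 29/28, so the gap between the two RAID levels narrows as strides get wider.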
Summary
Sequential performance is outstanding. To get significantly better system measurements you must have more channels or faster channels. Remember that the measurements were from 128 devices running at 15,000 revolutions per minute. It is interesting to speculate about what would happen if we used 256 devices spinning at 7,200 revolutions per minute. While we cannot run faster than 6,400 MBytes per second, it does not appear that we will run much slower.
[Figure 11 chart: milliseconds per I/O (0 to 20) versus operations per second (0 to 16,000); series: RAID-6D, RAID-6, RAID-5D, and RAID-5.]
Random performance is subtly better. Achieving (57000/48000=) 1.19 times as many requests on a DS5000 as on a DS4800 was not obvious.
Small improvements due to channel speed, HDD improvements, and a little bit of transfer and protocol mounted up to 19%. Writes got an additional boost from the much faster bus. Using independent links to handle cache write mirroring rather than sharing the channels going to the disks was also a benefit. The OLTP workload looked like a combination of the read and write misses with a very pleasing 2.13 times as many operations. Cache hit operations really improved the ratio.
In every measurement run (except read and write hit operations) processing was limited by the speed or number of HDDs or channels. There clearly is headroom in the DS5000 so that improvements in device or channel attachment can be incorporated.
The team that wrote this IBM Redpapers publication
This paper was produced by a team of specialists from around the world working at the International Technical Support Organization, Austin Center.
Alex Osuna is a Project Leader at the International Technical Support Organization, Austin Center. He writes extensively and teaches IBM classes worldwide on all areas of storage. Before joining the ITSO four years ago, Alex worked as a Principal Systems Engineer in Tivoli storage. He has 30 years of experience in the IT industry and holds certifications from IBM, RedHat, Microsoft, and the Open Systems Group.
Siebo Friesenborg is an Advanced Technical Support (ATS) representative for Storage in Americas Group. Siebo has been writing technical bulletins, white papers, and IBM product manuals about disk performance for more than 25 years. He holds a degree in mechanical engineering from the University of Delaware (class of 1965) and has been with IBM since then. His career with IBM has taken him from the Wilmington, Delaware, sales office to Philadelphia, Gaithersburg, Belgium, Dallas, and Tucson.
Thanks to the following people for their contributions to this project:
Bruce Allworth
Advisory IT Specialist - Storage - Open Systems - Advanced Technical Support (ATS), Americas
Al Watson
Senior IT Specialist, Storage - Open Systems - Advanced Technical Support (ATS), Americas
Gene Cullum
Certified Consulting IT Specialist - Storage - Disk - Advanced Technical Support (ATS), Americas
Kevin Cummings
Cert I/T Specialist - Storage - Performance Benchmarking - Advanced Technical Support (ATS), Americas
Copyright International Business Machines Corporation 2009. All rights reserved.
Note to U.S. Government Users Restricted Rights -- Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.
Notices
This information was developed for products and services offered in the U.S.A.
IBM may not offer the products, services, or features discussed in this document in other countries. Consult your local IBM representative for information on the products and services currently available in your area. Any reference to an IBM product, program, or service is not intended to state or imply that only that IBM product, program, or service may be used. Any functionally equivalent product, program, or service that does not infringe any IBM intellectual property right may be used instead. However, it is the user's responsibility to evaluate and verify the operation of any non-IBM product, program, or service.
IBM may have patents or pending patent applications covering subject matter described in this document. The furnishing of this document does not give you any license to these patents. You can send license inquiries, in writing, to:
IBM Director of Licensing, IBM Corporation, North Castle Drive, Armonk, NY 10504-1785 U.S.A.
The following paragraph does not apply to the United Kingdom or any other country where such provisions are inconsistent with local law: INTERNATIONAL BUSINESS MACHINES CORPORATION PROVIDES THIS PUBLICATION "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF NON-INFRINGEMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Some states do not allow disclaimer of express or implied warranties in certain transactions, therefore, this statement may not apply to you.
This information could include technical inaccuracies or typographical errors. Changes are periodically made to the information herein; these changes will be incorporated in new editions of the publication. IBM may make improvements and/or changes in the product(s) and/or the program(s) described in this publication at any time without notice.
Any references in this information to non-IBM Web sites are provided for convenience only and do not in any manner serve as an endorsement of those Web sites. The materials at those Web sites are not part of the materials for this IBM product and use of those Web sites is at your own risk.
IBM may use or distribute any of the information you supply in any way it believes appropriate without incurring any obligation to you.
Information concerning non-IBM products was obtained from the suppliers of those products, their published announcements or other publicly available sources. IBM has not tested those products and cannot confirm the accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products.
This information contains examples of data and reports used in daily business operations. To illustrate them as completely as possible, the examples include the names of individuals, companies, brands, and products. All of these names are fictitious and any similarity to the names and addresses used by an actual business enterprise is entirely coincidental.
COPYRIGHT LICENSE:
This information contains sample application programs in source language, which illustrate programming techniques on various operating platforms. You may copy, modify, and distribute these sample programs in any form without payment to IBM, for the purposes of developing, using, marketing or distributing application programs conforming to the application programming interface for the operating platform for which the sample programs are written. These examples have not been thoroughly tested under all conditions. IBM, therefore, cannot guarantee or imply reliability, serviceability, or function of these programs.
Redpaper
This document REDP-4498-00 was created or updated on February 24, 2009.
Send us your comments in one of the following ways:
Use the online Contact us review Redbooks form found at: ibm.com/redbooks
Send your comments in an email to: [email protected]
Mail your comments to: IBM Corporation, International Technical Support Organization, Dept. HYTD Mail Station P099, 2455 South Road, Poughkeepsie, NY 12601-5400 U.S.A.
Trademarks
IBM, the IBM logo, and ibm.com are trademarks or registered trademarks of International Business Machines Corporation in the United States, other countries, or both. These and other IBM trademarked terms are marked on their first occurrence in this information with the appropriate symbol (® or ™), indicating US registered or common law trademarks owned by IBM at the time this information was published. Such trademarks may also be registered or common law trademarks in other countries. A current list of IBM trademarks is available on the Web at http://www.ibm.com/legal/copytrade.shtml
The following terms are trademarks of the International Business Machines Corporation in the United States, other countries, or both:
DS4000
IBM
Redbooks
Redbooks (logo)
Tivoli
The following terms are trademarks of other companies:
Disk Magic, and the IntelliMagic logo are trademarks of IntelliMagic BV in the United States, other countries, or both.
Microsoft, MS, and the Windows logo are trademarks of Microsoft Corporation in the United States, other countries, or both.
Other company, product, or service names may be trademarks or service marks of others.