Hitachi Universal Storage Platform V
Architecture and Concepts
A White Paper
By Alan Benway (Performance Measurement Group, Technical Operations)
Confidential - Hitachi Data Systems Internal Use Only
June 2009
Notices and Disclaimer

Copyright 2009 Hitachi Data Systems Corporation. All rights reserved.
The performance data contained herein was obtained in a controlled, isolated environment. Actual results that may be obtained in other operating environments may vary significantly. While Hitachi Data Systems Corporation has reviewed each item for accuracy in a specific situation, there is no guarantee that the same or similar results can be obtained elsewhere.

All designs, specifications, statements, information and recommendations (collectively, "designs") in this manual are presented "AS IS," with all faults. Hitachi Data Systems Corporation and its suppliers disclaim all warranties, including without limitation, the warranty of merchantability, fitness for a particular purpose and non-infringement or arising from a course of dealing, usage or trade practice. In no event shall Hitachi Data Systems Corporation or its suppliers be liable for any indirect, special, consequential or incidental damages, including without limitation, lost profit or loss of or damage to data arising out of the use or inability to use the designs, even if Hitachi Data Systems Corporation or its suppliers have been advised of the possibility of such damages.
Universal Storage Platform is a registered trademark of Hitachi Data Systems, Inc. in the United States, other countries, or both.
Other company, product or service names may be trademarks or service marks of others.
This document has been reviewed for accuracy as of the date of initial publication. Hitachi Data Systems Corporation may make improvements and/or changes in product and/or programs at any time without notice.
No part of this document may be reproduced or transmitted without written approval from Hitachi Data Systems Corporation.
WARNING: This document is HDS-internal documentation, for informational purposes only. It is not to be disclosed to customers or discussed without a proper non-disclosure agreement (NDA).
Document Revision Level

Revision  Date           Description
1.0       December 2007  Initial Release
1.1       April 2008     Updates
2.0       February 2009  Major revision: concepts additions, updates
2.1       June 2009      Changes to MP Workload Sharing, eLUNs, ePorts, Cache Mode, and Concatenated Array Groups discussions; additional tables
Reference
Hitachi Universal Storage Platform V Performance Summary 08212008
Contributors

The information included in this document represents the expertise, feedback, and suggestions of a number of skilled practitioners. The author would like to recognize and thank the following reviewers of this document:
Gil Rangel, Director of Performance Measurement Group - Technical Operations
Dan Hood, Director, Product Management, Enterprise Arrays
Larry Korbus, Director, Product Management, HDP, UVM, VPM features
Ian Vogelesang, Performance Measurement Group - Technical Operations
Table of Contents

Introduction
    Glossary
Overview of Changes
    Software Overview
    Hardware Overview
        Processor Upgrade
        Cache and Shared Memory
        Features and PCBs
Architecture Details
    Summary of Installable Hardware Features
    Logic Box Details
    Memory Systems Details
        Shared Memory (SMA)
        Cache Switches (CSW) and Data Cache (CMA)
        Data Cache Operations Overview
        Random I/O Cache Operations
        Sequential I/O Cache Operations and Sequential Detect
        BED and FED Local RAM
    Front-End Director Concepts
        FED FC-8 port, ESCON, and FICON: Summary of Features
        FED FC-16 port Feature
        I/O Request Limits and Queue Depths (Open Fibre)
        MP Distributed I/O (Open Fibre)
        External Storage Mode I/O (Open Fibre)
    Front-End Director Board Details
        Open Fibre 8-Port Feature
        Open Fibre 16-Port Feature
        ESCON 8-Port Feature
        FICON 8-Port Feature
    Back-End Director Concepts
        BED Feature Summary
    Back-End Director Board Details
        BED Details
        Back-End RAID Level Organization
    Universal Storage Platform V: HDU and BED Associations by Frame
    Disk Details
        SATA Disks
    HDU Switched Loop Details
Universal Storage Platform V Configuration Overviews
    Small Configuration (2 FEDs, 2 BEDs)
    Midsize Configuration (4 FEDs, 4 BEDs)
    Large Configuration (8 FEDs, 8 BEDs)
Provisioning - Managing Storage Volumes
    Traditional Host-based Volume Management
    Traditional Universal Storage Platform Storage-based Volume Management
    Dynamic Provisioning Volume Management
        Usage Overview
    Hitachi Dynamic Provisioning Pools
    V-VOL Groups and DP Volumes
        Usage Mechanisms
        DPVOL Features and Restrictions
        Pool Page Details
        Pool Expansion
        Miscellaneous Hitachi Dynamic Provisioning Details
        Hitachi Dynamic Provisioning and Program Products Compatibility
        Universal Storage Platform V Volume Flexibility Example
Storage Concepts
    Understand Your Customer's Environment
    Disk Types
    RAID Levels
    Parity Groups and Array Groups
    RAID Chunks and Stripes
    LUNs (host volumes)
    Number of LUNs per Parity Group
    Port I/O Request Limits, LUN Queue Depths, and Transfer Sizes
        Port I/O Request Limits
        Port I/O Request Maximum Transfer Size
        LUN Queue Depth
    Mixing Data on the Physical Disks
    Workload Characteristics
    Selecting the Proper Disk Drive Form Factor
    Mixing I/O Profiles on the Physical Disks
    Front-end Port Performance and Usage Considerations
        Host Fan-in and Fan-out
        Mixing I/O Profiles on a Port
Summary
Appendix 1. Universal Storage Platform V (Frames, HDUs, and Array Groups)
Appendix 2. Open Systems RAID Mechanisms
Appendix 3. Mainframe 3390-x and Open-x RAID Mechanisms
Appendix 4. BED: Loop Maps with Array Group Names
Appendix 5. FED: 8-Port Fibre Channel Maps with Processors
Appendix 6. FED: 16-Port Fibre Channel Maps with Processors
Appendix 7. FED: 8-Port FICON Maps with Processors
Appendix 8. FED: 8-Port ESCON Maps with Processors
Appendix 9. Veritas Volume Manager Example
Appendix 10. Pool Details
Appendix 11. Pool Example
Appendix 12. Disks: Physical IOPS Details
Appendix 13. File Systems and Hitachi Dynamic Provisioning Thin Provisioning
Appendix 14. Putting SATA Drive Performance into Perspective
Appendix 15. LDEVs, LUNs, VDEVs and More
    Internal VDEV
    External VDEV
    CoW VDEV
    Dynamic Provisioning VDEV
    Layout of internal VDEVs on the parity group
Appendix 16. Concatenated Parity Groups
    Advantages of concatenated Parity Groups
    Disadvantages of concatenated Parity Groups
    Using LDEVs on concatenated Parity Groups as Dynamic Provisioning pool volumes
    Summary of the characteristics of concatenated Parity Groups (VDEV interleave)
Introduction

This document covers the architecture and concepts of the Hitachi Universal Storage Platform V. The concepts and physical details regarding each hardware feature are covered in the following sections. However, this document is not intended to cover any aspects of program products, databases, customer-specific environments, or new features available by the second general release. Areas not covered by this document include:

TrueCopy/ShadowImage/Universal Replicator Disaster Recovery Solutions
Host Logical Volume Management general guidelines
Hitachi Dynamic Link Manager (HDLM) general guidelines
Universal Volume Management (UVM) virtualization of external storage for Data Lifecycle Management (DLM)
Virtual Partition Management (VPM) general guidelines for workload management
Oracle general guidelines for storage configuration
Microsoft Exchange general guidelines for storage configuration

This document is intended to familiarize Hitachi Data Systems sales personnel, technical support staff, customers, and value-added resellers with the features and concepts of the Universal Storage Platform V. The users who will benefit most from this document are those who already possess an in-depth knowledge of the Hitachi TagmaStore Universal Storage Platform architecture.
Glossary

Throughout this paper the terminology used by Hitachi Data Systems (not Hitachi) will normally be used. As some storage terminology is used differently in Hitachi documentation or by users in the field, here are some definitions as used in this paper:

Array Group - The term used to describe a set of four physical disk drives installed as a group in the subsystem. When a set of one or two Array Groups (four or eight disk drives) is formatted using a RAID level, the resulting formatted entity is called a Parity Group. Although technically the term Array Group refers to a group of bare physical drives, and the term Parity Group refers to something that has been formatted as a RAID level and therefore actually has parity data (here we consider a RAID 10 mirror copy as parity data), be aware that this technical distinction is often lost and you will see the terms Parity Group and Array Group used interchangeably.
BED - Back-end Director feature; the pair of Disk Adapter (DKA) PCBs providing 4 pairs of back-end Fibre Channel loops. Pairs of features (4 PCBs, or 8 pairs of loops) are always installed.

CHA - Hitachi's name for the FED.

CMA - Cache Memory Adapter board (8 x 1064MB/sec ports, 4 banks of RAM using 16 DDR2 DIMM slots).

Concatenated Parity Group - A configuration where the VDEVs corresponding to a pair of RAID 10 (2D+2D) or RAID 5 (7D+1P) Parity Groups, or a quad of RAID 5 (7D+1P) Parity Groups, are interleaved on a RAID-stripe-by-RAID-stripe, round-robin basis on their underlying disk drives. This has the effect of dispersing I/O activity over twice or four times the number of drives, but it does not change the number, names, or size of the VDEVs, and hence it doesn't make it possible to assign larger LDEVs. Note that we often refer to RAID 10 (4D+4D), but this is actually two RAID 10 (2D+2D) Parity Groups interleaved together. For a more comprehensive explanation see the Concatenated Parity Groups section in the appendix.

CSW - Cache Switch board (16 x 1064MB/sec ports), a 2D version of a Hitachi supercomputer 3D crossbar switch with nanosecond latencies (port to port).

DKC - Hitachi's name for the base control frame (or rack) where the Logic Box and up to 128 disks are located.

DKA - Hitachi's name for the BED.

DPVOL - a Dynamic Provisioning (DP) Volume; the virtual volume from an HDP Pool. Some documents refer to this as a V-VOL. It is a member of a V-VOL Group. Each DPVOL has a user-specified size between 8GB and 4TB.

Feature - an installable hardware option (such as FED, BED, CSW, CMA, SMA) that includes two PCBs (one per power domain). Note that two BED features must be installed at a time.

FED - Front-end Director feature; the pair of Channel Adapter (CHA) PCBs used to attach hosts to the storage system, providing Open FC, FICON, or ESCON attachment.

HDU (Hard Disk Unit) - the 64-disk container in a frame that has 32 disk slots on the front side and 32 more on the back. The HDU is further split into left and right halves, each on separate power domains. There are up to four HDUs per expansion frame, and two in the control frame.

LDEV (Logical Device) - A logical volume internal to the subsystem that can be used to contain customer data. LDEVs are uniquely identified within the subsystem using a six-hex-digit identifier in the form LDKC:CU:LDEV. LDEVs are carved from a VDEV (see VDEV), and thus there are four types of LDEVs: internal LDEVs, external LDEVs, CoW V-VOLs, and DPVOLs. LDEVs may be mapped to a host as a LUN, either as a single LDEV or as a set of up to 36 LDEVs combined in the form of a LUSE. Note: what is called an LDEV in HDS enterprise subsystems like the Universal Storage Platform V is called an LU or LUN in HDS modular subsystems like the AMS family, and what is called a LUN in HDS enterprise subsystems is called an H-LUN in HDS modular subsystems.

LUN (Logical Unit Number) - the host-visible identifier assigned by the user to an LDEV to make it visible on a host port. An internal LUN has a nominal Queue Depth limit of 32, and an external (virtualized) LUN has a Queue Depth limit of 2-128 (adjustable).
eLUN (external LUN) - an external LUN is one which is located in another storage array that is attached via two or more ePorts on a Universal Storage Platform V and accessed by the host through the Universal Storage Platform V.

ePort (external Port) - An external array connects via two or more Universal Storage Platform V Fibre Channel FED ports to the Universal Storage Platform V instead of to a host. The FED ports used in this manner are changed from a host target into an initiator port (or external Port) by use of the Universal Volume Manager software product. The Universal Storage Platform V will discover any exported LUNs on each ePort, and these will be configured as eLUNs used by hosts attached to the Universal Storage Platform V.

LUSE (Logical Unit Size Expansion) - A concatenation (spill and fill) of 2 to 36 LDEVs (up to a 60TB limit) that is then presented to a host as a single LUN. A LUSE will normally perform at the level of just one of these LDEVs.

MP (microprocessor) - the CPU used on the FEDs and BEDs. Also called CHP (FED) or DKP (BED).

Parity Group - a set of one or two Array Groups (a set of 4 or 8 disk drives) formatted as a RAID level, either as RAID 10 (often referred to as RAID 1 in HDS documentation), RAID 5, or RAID 6. The Universal Storage Platform V's Parity Group types are RAID 10 (2D+2D), RAID 5 (3D+1P), RAID 5 (7D+1P), and RAID 6 (6D+2P). Internal LDEVs are carved from the VDEV(s) corresponding to the formatted space in a Parity Group, and thus the maximum size of an internal LDEV is determined by the size of the VDEV it is carved from. The maximum size of an internal VDEV is approximately 2.99TB. If the formatted space in a Parity Group is bigger than 2.99TB, then as many maximum-size VDEVs as possible are assigned, and then the remaining space is assigned as the last, smaller VDEV. Note that there actually is no 4+4 Parity Group type; see Concatenated Parity Group. (A small illustrative sketch of this VDEV carving rule follows this glossary.)

PCB - printed circuit board; an installable board (adapter). There are two PCBs per Feature.

PDEV (Physical DEVice) - a physical internal disk drive.

SMA - Shared Memory Adapter board (64 x 150MB/sec ports, 2 banks of RAM using 8 DDR2 DIMM slots, up to 8GB).

SVP - Service Processor (a PC running Windows XP) installed in the control rack.

VDEV - The logical container from which LDEVs are carved. There are four types of VDEVs:
o Internal VDEV (2.99TB max): maps to the formatted space within a parity group that is available to store user data. LDEVs carved from a parity group VDEV are called internal LDEVs.
o External storage VDEV (2.99TB max): maps to a LUN on an external (virtualized) subsystem. LDEVs carved from external VDEVs are called external LDEVs.
o Copy-on-Write (CoW) VDEV (2.99TB max): called a "V-VOL group", and LDEVs carved from a CoW V-VOL group are called CoW V-VOLs.
o Dynamic Provisioning (DP) VDEV (4TB max): called a V-VOL group, and LDEVs carved from a DP V-VOL group are called DPVOLs (Dynamic Provisioning Volumes).

V-VOL Group - the organizational container of either a Dynamic Provisioning VDEV or a Copy-on-Write VDEV. With Dynamic Provisioning, it is used to hold one or more DPVOLs.
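To make the carving rule in the Parity Group and VDEV entries concrete, here is a small illustrative Python sketch. The RAID 5 (7D+1P) group of 750GB drives used as input is a hypothetical example, not a configuration taken from this paper:

    # Illustrative only: splitting a Parity Group's formatted space into VDEVs,
    # using the ~2.99TB maximum internal VDEV size quoted in this paper.
    MAX_VDEV_TB = 2.99

    def carve_vdevs(formatted_tb):
        """Return the VDEV sizes (in TB) carved from one Parity Group."""
        vdevs = []
        remaining = formatted_tb
        while remaining > MAX_VDEV_TB:
            vdevs.append(MAX_VDEV_TB)          # as many maximum-size VDEVs as fit
            remaining -= MAX_VDEV_TB
        if remaining > 0:
            vdevs.append(round(remaining, 2))  # remaining space: the last, smaller VDEV
        return vdevs

    # Hypothetical RAID 5 (7D+1P) group of 750GB drives: 7 data drives = 5.25TB formatted.
    print(carve_vdevs(5.25))                   # -> [2.99, 2.26]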
Overview of Changes

Expanding on the proven and superior Hitachi TagmaStore Universal Storage Platform technology, the Universal Storage Platform V offers a new level of Enterprise Storage, capable of meeting the most demanding of workloads while maintaining great flexibility. The Universal Storage Platform V offers much higher performance, higher reliability, and greater flexibility than any competitive offering today. These are the new features that distinguish the completely revamped Universal Storage Platform V from the previous Universal Storage Platform models:
Software
o Hitachi Dynamic Provisioning (HDP) volume management feature.

Hardware
o Enhanced Shared Memory system (up to 24GB and 256 paths @ 150MB/s).
o Faster, newer-generation 800MHz NEC RISC processors on FEDs (Channel Adapters) and BEDs (Disk Adapters).
o FED MPs have an MP Workload Sharing function (per PCB).
o BEDs have 4Gbit/s back-end disk loops.
o Switched FC-AL loop interface between BEDs and disks.
o Half-sized PCBs are now used (except for Shared Memory PCBs), allowing for more flexible FED configuration combinations.
o Support for internal 1TB 7200RPM SATA II disks.
o Support for Flash Drives.
Software Overview

The Universal Storage Platform V software includes Hitachi Dynamic Provisioning, a major new Open Systems volume management feature that will allow storage managers and system administrators to more efficiently plan and allocate storage to users or applications. This new feature provides two new capabilities: thin provisioning and enhanced volume performance. Hitachi Dynamic Provisioning provides for the creation of one or more Hitachi Dynamic Provisioning Pools of physical space (each Pool assigned multiple LDEVs from multiple Parity Groups of the same disk types and RAID level), and for the establishment of DP volumes (DPVOLs, or virtual volumes) that are connected to a single Hitachi Dynamic Provisioning Pool.

Thin provisioning comes from the creation of DPVOLs of a user-specified logical size without any corresponding allocation of physical space. Actual physical space (allocated as 42MB Pool pages) is only assigned to a DPVOL from the connected Hitachi Dynamic Provisioning Pool as that DPVOL's logical space is written to by the host over time. A DPVOL does not have any Pool pages assigned to it when it is first created. Technically, it never does: the pages are "loaned out" from its connected Pool to that DPVOL until the volume is deleted from the Pool. At that point, all of that DPVOL's assigned pages are returned to the Pool's Free Page List. Certain individual pages can be freed or reclaimed from a DPVOL using facilities of the Universal Storage Platform V.

The volume performance feature is an automatic result of the manner in which the individual Hitachi Dynamic Provisioning Pools are created. A Pool is created using 2 to 1024 LDEVs (Pool Volumes) that provide the physical space, and the Pool allocates 42MB Pool pages on demand to any of the DPVOLs connected to that Pool. Each individual 42MB Pool page is consecutively laid down on a whole number of RAID stripes from one Pool Volume. Other pages assigned over time to that DPVOL will randomly originate from the next free 42MB page from one of the other Pool Volumes in that Pool.
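A minimal sketch of this behavior, assuming a simple round-robin rotation over Pool Volumes (the Pool class and LDEV names are hypothetical; the real microcode's page-selection policy is more involved):

    # Hypothetical model of HDP page allocation as described above: a Pool hands
    # out 42MB pages, and successive pages for a DPVOL come from different Pool
    # Volumes, spreading each DPVOL over all disks backing the Pool.
    from collections import defaultdict
    from itertools import cycle

    PAGE_MB = 42

    class Pool:
        def __init__(self, pool_volumes):
            self.next_volume = cycle(pool_volumes)   # simplistic rotation
            self.pages = defaultdict(list)           # DPVOL -> backing Pool Volumes

        def host_write(self, dpvol, logical_mb):
            """Assign enough 42MB pages to back a first write of logical_mb."""
            for _ in range(-(-logical_mb // PAGE_MB)):   # ceiling division
                self.pages[dpvol].append(next(self.next_volume))

    pool = Pool([f"LDEV-{i:02d}" for i in range(12)])    # 12 Pool Volumes
    pool.host_write("DPVOL-00", 100)                     # 100MB -> three 42MB pages
    print(pool.pages["DPVOL-00"])                        # ['LDEV-00', 'LDEV-01', 'LDEV-02']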
As an example, assume that there are 12 LDEVs from twelve RAID 10 (2D+2D) Parity Groups assigned to a Hitachi Dynamic Provisioning Pool. All 48 disks in that Pool will contribute their IOPS and throughput power to all of the DPVOLs connected to that Pool. If more random read IOPS horsepower were desired for that Pool, then it could have been, for example, created with 16 LDEVs from 16 RAID 5 (7D+1P) Parity Groups, thus providing 128 disks of IOPS power to that Pool.

As up to 1024 LDEVs may be assigned as Pool Volumes to a single Pool, this would provide a considerable amount of I/O power to that Pool's DPVOLs. This type of aggregation of disks was previously only possible through often expensive and somewhat complex host-based volume managers (such as VERITAS VxVM) on each of the attached servers. On the Universal Storage Platform (and the Universal Storage Platform V), the only alternative would be to build striped Parity Groups (Concatenated Parity Groups; see Appendix 16) using two or four RAID 5 (7D+1P) Parity Groups. This would provide either 16 or 32 disks under the volumes (VDEVs) created there. There is also the LUSE option, but that is merely a simple concatenation of 2-36 LDEVs, a spill-and-fill capacity (not performance) configuration.
A common question is: "How does the performance of DPVOLs differ from the use of Standard Volumes when using the same number of disks?" Consider this example. Say that you are using 32 disks in 8 Parity Groups as RAID 10 (2D+2D) with 8 LDEVs, all used as Standard Volumes over 4 host paths. Compare this to a Hitachi Dynamic Provisioning Pool with those same 8 LDEVs as Pool Volumes, with 8 DPVOLs configured against that Pool and used over 4 host paths. If the hosts are applying a heavy, uniform, concurrent workload to all 8 Standard Volumes, then the server will see about the same aggregate IOPS capacity as would be available from the 8 DPVOLs with the same workload. However, if the workloads per volume (Standard or DPVOL) are not uniform, or see intermittent workloads over time, the DPVOLs will always deliver a constant and much higher IOPS capacity than will the individual Standard Volumes. If only four volumes (Standard or DPVOL) were simultaneously active, then the four Standard Volumes would only have a 16-disk IOPS capacity while the four DPVOLs would always have the full 32-disk IOPS capacity. Another way to see this is that the use of Standard Volumes can easily lead to hot spots, whereas the use of DPVOLs will mostly avoid them.
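The arithmetic behind that comparison can be written out directly (the 150 IOPS per disk figure below is an assumed nominal value for illustration, not a measurement from this paper):

    # Back-of-the-envelope model of the example above; 150 IOPS/disk is assumed.
    IOPS_PER_DISK = 150
    total_disks = 32                        # 8 x RAID 10 (2D+2D) Parity Groups

    # Only 4 of the 8 volumes are simultaneously active:
    standard = 4 * 4 * IOPS_PER_DISK        # each Standard Volume sits on one 4-disk group
    dpvol = total_disks * IOPS_PER_DISK     # DPVOL pages are spread over all 32 disks
    print(standard, dpvol)                  # 2400 vs. 4800 available random IOPS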
Hardware Overview
Figure 1. Frame and HDU Map
[Figure 1 shows the five-frame layout: the L2, L1, R1, and R2 expansion frames (DKU-L2, DKU-L1, DKU-R1, DKU-R2) each hold four HDUs (HDU-1L/1R through HDU-4L/4R), and the central Control Frame (DKC) holds the Logic Box plus two HDUs (HDU-1L/1R and HDU-2L/2R); every HDU houses 16 HDDs on its front side and 16 on its rear.]
While there are many physically apparent changes to the Universal Storage Platform V chassis and PCB cards from the previous Universal Storage Platform model, there are also a number of not-so-evident internal configuration changes that an SE must be aware of when laying out a system.

The Universal Storage Platform V is a collection of frames, HDUs, PCB cards in a Logic Box, FC-AL switches and disks (see Figure 1). The frames include the control frame (DKC) and the disk expansion frames (DKUs). Disks are added to the 64-disk HDU containers (up to 18 such HDUs) in sets of four (the Array Group). Array Groups are installed (following a certain upgrade order) into specific HDU disk slots on either the left or right half (such as HDU-1R or HDU-1L) and front and rear of an HDU box. Sets of HDUs are controlled by one of the BEDs.
Processor Upgrade

The processor (MP) used on the FEDs (the MPs are also called CHPs) and BEDs (the MPs are also called DKPs) has been improved and its clock speed has been doubled. The quantities of processors represented in Table 1 are values per PCB by feature. A feature is defined as a pair of PCBs where each board is located on a separate power boundary. As there are twice as many FED/BED features in the Universal Storage Platform V as compared to the Universal Storage Platform, the overall processor count for FEDs and BEDs remains the same but the available processing power has been doubled.
Table 1. Universal Storage Platform V Processor Enhancements (MPs per PCB)

Feature              USP        USP V
FED ESCON 8-port     2x400MHz   -
FED ESCON 16-port    4x400MHz   -
FED FICON 8-port     -          4x800MHz
FED FICON 16-port    8x400MHz   4x800MHz
FED FC 8-port        -          4x800MHz
FED FC 16-port       4x400MHz   4x800MHz
FED FC 32-port       8x400MHz   -
BED FC-AL 8-port     -          4x800MHz
BED FC-AL 16-port    8x400MHz   -
Cache and Shared Memory

The Data Cache system carries over the same path speeds (1064MB/s) and counts (up to 64 paths) from the Universal Storage Platform, with a peak wire-speed bandwidth of 68GB/s. The Shared Memory (or Control Memory) system has been significantly upgraded over the Universal Storage Platform, with 256 paths (up from 192) operating at 150MB/s (up from 83MB/s), with a peak bandwidth of 38.4GB/s (up from 15.9GB/s). On the Universal Storage Platform, the FED PCBs had 8 Shared Memory paths, and the BEDs had 16. Now all of the PCBs have 16 Shared Memory paths.

Note that the Data Cache system contains only the actual user data blocks. The Shared Memory system holds all of the metadata about the internal Parity Groups, LDEVs, external LDEVs, and runtime tables for various software products. There can be up to 512GB of Data Cache and 24GB of Shared Memory.
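The quoted peak bandwidths follow directly from the path counts and per-path speeds:

    # Peak wire-speed bandwidths derived from the path counts quoted above.
    data_gb = 64 * 1064 / 1000     # 64 Data Cache paths @ 1064MB/s   -> ~68.1 GB/s
    ctrl_gb = 256 * 150 / 1000     # 256 Shared Memory paths @ 150MB/s -> 38.4 GB/s
    print(round(data_gb, 1), round(ctrl_gb, 1), round(data_gb + ctrl_gb, 1))
    # -> 68.1 38.4 106.5 (the paper rounds these to 68GB/s and a 106.4GB/s aggregate)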
Features and PCBs

Logic boards (PCBs) are installed in the front and rear slots in the Logic Box in the Control Frame. The logic board types for the Universal Storage Platform V include the following (as features, each a pair of PCBs):

CSW - Cache Switch: 1 to 4 features (2 to 8 PCBs)
CMA - Cache Memory: 1 to 4 features (2 to 8 PCBs)
SMA - Shared Memory: 1 or 2 features (2 to 4 PCBs)
FED - Front-end Director (or Channel Adapter): 1 to 14 features (2 to 28 PCBs)
BED - Back-end Director (or Disk Adapter): 2, 4, 6, or 8 features (4, 8, 12, 16 PCBs) of 4Gb/sec Fibre Channel loops

The Universal Storage Platform V's new half-sized PCBs (now installed in upper and lower, front and rear slots in the Logic Box, described later) allow for a less costly, more incremental expansion of a system. Also, the BED slots will support the use of additional FED PCBs (but 2 BED features must be installed at a minimum). For example, there could be 1-4 FED features (2-8 PCBs) installed in a Universal Storage Platform, and they could be any mixture of Open Fibre, ESCON, FICON, and iSCSI. However, this gave you a large number of ports of a single type that you may not need, with a substantial reduction of other port types that you may need to maximize. With the new half-sized cards, you can have 1-8 FED features (or up to 14 FED features if the number of BED features is reduced to just 2), using any mixture (per feature) of the interface types as before. Now, as there are half as many ports per board, smaller numbers of lesser-used port types may be installed. Features are still installed as pairs of PCB cards just as with the Universal Storage Platform.
Table 2. Summary of Limits, Universal Storage Platform to Universal Storage Platform V

Limits                              USP          USP V
Data Cache (GB)                     128          512
Raw Cache Bandwidth                 68GB/sec     68GB/sec
Shared Memory (GB)                  12           24
Shared Memory Paths (max)           192          256
Raw Shared Memory Bandwidth         15.9GB/sec   38.4GB/sec
Fibre Channel Disks                 1152         1152
SATA Disks                          -            1152
Logical Volumes                     16k          64k
Max Internal Volume Size            2TB          2.99TB
Max CoW Volume Size                 2TB          4TB
Max External Volume Size            2TB          4TB
I/O Request Limit per FC FED MP     2048         4096
Nominal Queue Depth per LUN         16           32
HDP Pools                           -            128
Max Pool Capacity                   -            1.1PB
Max Capacity of All Pools           -            1.1PB
LDEVs per Pool                      -            1024
DP Volumes per Pool                 -            8192
DP Volume Size Range                -            46MB-4TB
Architecture Details

Figure 2. Fully Optioned Universal Storage Platform V Architecture
For quick reference, Figure 2 above depicts the fully optioned Universal Storage Platform V architecture. Other Universal Storage Platform V configurations are detailed further down. This architecture includes 64 x 1064MB/sec Data paths representing 68GB/sec of data bandwidth and 256 x 150MB/sec Control paths representing 38.4GB/sec of metadata and control bandwidth. When comparing the Universal Storage Platform V to other vendors' monolithic cache systems (such as EMC DMX-4), the aggregate of the Universal Storage Platform V's Data + Control bandwidths must be used for an apples-to-apples comparison. The throughput aggregate for the fully optioned system is a fully usable 106.4GB/s. This includes any overhead for block mirroring operations that occur in both the Data Cache and Shared Memory systems (discussed further down).
Summary of Installable Hardware Features

The tables below show overviews of the available features for both the Universal Storage Platform and the Universal Storage Platform V. Except for the Shared Memory PCBs, all of the other PCBs are now half-sized. The two tables following Table 3 show the tradeoff between installed Universal Storage Platform V BED features and available Universal Storage Platform V FED features. If someone needs a large number of ports (and reduced amounts of internal disks and performance) for virtualization, additional FED features may be installed in the place of up to 6 BED features.
Table 3. Comparison: Universal Storage Platform and Universal Storage Platform V Features

Features         USP    USP V
FEDs             1-7    1-14
BEDs             1-4    2, 4, 6, 8
Cache            2      4
Shared Memory    2      2
Cache Switches   2      4
FED 16-Port Fibre Channel Kits Installed (each FED feature adds 16 Open Fibre ports):

BED Pairs Installed   Max FED Features   Total Ports
1 & 2                 14                 16, 32, 48, ... up to 224
3 & 4                 12                 16, 32, 48, ... up to 192
5 & 6                 10                 16, 32, 48, ... up to 160
7 & 8                 8                  16, 32, 48, ... up to 128

FED 8-Port FICON or ESCON Channel Kits Installed (each FED feature adds 8 ports):

BED Pairs Installed   Max FED Features   Total Ports
1 & 2                 14                 8, 16, 24, ... up to 112
3 & 4                 12                 8, 16, 24, ... up to 96
5 & 6                 10                 8, 16, 24, ... up to 80
7 & 8                 8                  8, 16, 24, ... up to 64

Logic Box Details

The Logic Box chassis (main DKC frame) is where all of the different types of PCBs for the features are installed in specific slots (see Figure 3). This layout is very different from the previous Universal Storage Platform model. All of the PCBs are now half-sized, and there are upper and lower slots in addition to the front and back layout. The associations among BEDs, FEDs and the CSWs (cache switches) are also different. The factory names of the PCB types and their option numbers are shown as well. Note that FED upgrade features may consume up to six pairs of unused BED slots.

Figure 3. Logic Box Slots (BED slots 3-8 can be used for FEDs if desired)

[The Figure 3 slot map lays out the front (Cluster-1) and rear (Cluster-2) of the Logic Box, each with an upper and a lower row of slots. Each slot has a factory name (such as 1AU or 2ML) and an option number, and holds one of the FED-1 through FED-8, BED-1 through BED-8, CSW-0 through CSW-3, CMA-1 through CMA-4, or SMA-1/SMA-2 boards.]
Memory Systems Details

The Universal Storage Platform V memory subsystems are unique among all enterprise arrays on the market. All other designs use a single global cache space that controls the operation of the array. All accesses for data, control tables, metadata, replication software tables and such go against the same common cache system. All of these activities compete with one another over the same internal paths for cache bandwidth. As such, cache access is a primary choke point in competing designs.

The Universal Storage Platform V has four parallel high-performance memory systems that isolate access to data, metadata, control tables and array software. These include:

Shared Memory (Control Memory) system for metadata and control tables
Data Cache (data blocks only)
Local RAM pool on each FED and BED PCB (up to 32 such PCBs in an array) for use by the four MP processors on each board (128 processors in a full array). This RAM is used for the workspace for each MP, local data block caching, and local LUN management tables.
NVRAM region for each MP that holds the microcode and optional software packages.
Shared Memory (SMA)

The SMA subsystem is critical to achieving the Universal Storage Platform V's very high array performance, as all control and metadata information is contained within this system. Access to and management of all data blocks in the Data Cache is managed by the SMA system. The SMA subsystem is unique among all enterprise arrays on the market. All other designs use a single global cache for all data, metadata, control tables, and software. These other designs create a serialization of access into cache for all of these competing operations. The SMA design removes all of the non-data access to Data Cache, allowing data blocks to be moved unimpeded at very high rates.

The SMA subsystem is sized according to the array configuration (hardware and software). A Universal Storage Platform V can have up to 24GB of SMA installed across two features (4 PCBs). A section of each board is mirrored onto the other board of the feature for redundancy.

The Logic Box chassis has slots for four SMA PCBs. One SMA feature is installed in the base system and one more feature (2 PCBs) is an upgrade option. Due to the way the Shared Memory subsystem functions, the optional SMA feature should always be installed in a system. As described below, each FED and BED PCB has 8 paths into the Shared Memory subsystem. It is important to know that these paths are hard-wired to specific ports on each SMA PCB. Each FED or BED PCB directs 2 SMA paths to each installed SMA PCB. With all 4 SMA PCBs, this means that all 8 SMA paths per FED/BED PCB are connected.
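In other words, the number of connected Shared Memory paths per FED/BED PCB scales with the number of SMA boards present, which is why the optional feature matters; a one-line sketch:

    # Hard-wired SMA fan-out: each FED/BED PCB runs 2 paths to each SMA PCB.
    for sma_pcbs in (2, 4):    # one SMA feature installed vs. both features
        print(f"{sma_pcbs} SMA PCBs -> {2 * sma_pcbs} of 8 SMA paths per FED/BED PCB in use")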
Figure 4. Shared Memory PCB
Each SMA PCB has 8 DDR2-333 DIMM slots, organized as two separate banks of RAM (4 slots each). A single board can support up to 8GB of RAM when using either 1GB or 4GB DIMMs. The pair of boards for each SMA feature is installed in different power domains in the Logic Box (Cluster-1 and Cluster-2). Each PCB has 64 x 150MB/s SMA (control) paths. This provides for 9.6GB/s of bandwidth (wire speed) per board. Each bank of RAM has a peak bandwidth of 5.19GB/s, or a concurrent total of 10.38GB/s per board. Due to the very high speed of the RAM compared to the individual SMA ports, with port buffering and port interleaving, the RAM allows the 64 SMA paths to operate at full speed.
For each SMA feature, there is a small region that is mirrored onto the other board in that pair for redundancy. (Note: the region used for Hitachi Dynamic Provisioning control is backed up onto a private disk region on some Pool Volumes.) The Cluster-1 SMA boards are associated with the Cache-A region of the Data Cache (explained below), while the Cluster-2 boards are associated with the Cache-B region.

The basic configuration information for the array is located in the lower memory address range of Shared Memory. This is also backed up in the NVRAM located on each FED and BED PCB in the system. The metadata and tables for optional software products (such as Hitachi Dynamic Provisioning) in the other (higher) memory address regions of Shared Memory are backed up to the internal disk in the Service Processor (SVP PC) at array power-down.
Cache Switches (CSW) and Data Cache (CMA)

Each Universal Storage Platform V has up to 8 Data Cache boards (CMA) and 8 Cache Switch boards (CSW). The cache switches connect the individual cache boards to the cache ports on each FED and BED board. The Logic Box chassis has slots for 8 Cache Memory (CMA) and 8 Cache Switch (CSW) PCBs. One CMA feature and one CSW feature are installed in the base system, and three more features (6 PCBs) of each type may be added as upgrades. The eight 1064MB/s cache ports on each CMA board are attached to half of the ports on the "cache side" of the central Cache Switches. The "processor side" ports on the Cache Switches are connected to FED and BED cache ports. Due to the way the Data Cache subsystem is interconnected through the Cache Switch system, any CMA port on any FED or BED board can address any cache board. As described below, each FED and BED PCB has 2 paths into the Cache Switch subsystem.
CSW - Cache Switch

Each Cache Switch board (CSW) has 16 bidirectional ports (8 to cache ports, 8 to FED/BED ports), each port operating at 1064MB/s. These cache switches are very high performance crossbar switches of the kind used in high-performance supercomputers and servers. In fact, these CSW boards are a 2D scaled-down version of a 3D switch used in Hitachi supercomputers. The CSW was designed by the Hitachi Supercomputer and Hitachi Networks divisions. The internal port-to-port latency of this switch is in the nanosecond range.
The number of CSW features needed depends purely on the number of CHA/DKA features installed:

Adding more CSW features to an existing configuration only permits more FED/BED features to be installed.
Adding more CSW features does not affect the performance of existing FED/BED features.
Each CSW board can sustain 8 x 1064MB/s transfers between the 8 FED/BED paths and the 8 cache ports, for an aggregate of 8.5GB/s per board. As there may be up to four CSW features installed per system (8 CSW PCBs), there can be 64 cache-side ports switching among 64 FED/BED processor ports. This provides for a total of 68GB/s per array of non-blocking, data-only bandwidth to cache from the FED/BED boards at extremely small (nanosecond) latency rates. No other vendor has a non-blocking design (despite some unsupportable claims to the contrary), nor are they able to provide sustained data-only bandwidth at anywhere near this rate. This bandwidth is not constrained by the type of I/O (read versus write), since all cache paths are full speed and bidirectional.
Figure 5 illustrates a Universal Storage Platform V array with only two CSW and two CMA features installed. If all four CSW and CMA features (8 boards each) were installed, then each cache-side path on a CSW PCB would go to one of the eight CMA PCBs. Every CSW has at least one path to every CMA. Farther down in this document are maps of the associations of FEDs and BEDs to the CSWs. There is a certain relationship from the CSW processor side to the FED and BED CMA ports.
Figure 5. Universal Storage Platform V with Two Cache Features and Two Cache Switch Features Installed.
CMA - Data Cache

Each CMA cache board has 16 DDR2-400 DIMM slots, organized as four banks of RAM (thus 32 independent banks across all four features). Each CMA board can support 8-64GB of RAM, using a single size of DIMM across all installed CMA boards. The same amount of RAM must be installed on each CMA board. The pair of boards for a CMA feature is installed in different power domains in the Logic Box (Cluster-1 and Cluster-2).

Each CMA has 8 x 1064MB/s bidirectional data paths. This provides for 8.5GB/s of read-write bandwidth (wire speed) per board. Each bank of RAM has a peak bandwidth of 6.25GB/s, or 25GB/s per board. Due to the very high speed of the RAM compared to the CMA ports, with port buffering and port interleaving, the RAM allows all 8 CMA cache paths per board to operate at full speed.
Figure 6. Data Cache Board (one of a feature pair)
Data Cache Operations Overview

The Data Cache is organized into two contiguous address ranges named Cache-A and Cache-B. The RAM located across the Cluster-1 CMA boards is a single address range called Cache-A, while that of the Cluster-2 boards is called Cache-B. The address range of Cache-B immediately follows the end of Cache-A (illustrated below). This entire space is addressable in units of 2KB cache blocks. A data block for an I/O read request can be placed in either Cache range by the BED MP (or FED MP for external storage I/O operations). Write operations are simultaneously duplexed into both Cache-A and Cache-B by the FED MP processing the request. Note that only the individual write blocks are mirrored, not the entire cache space.
Figure 7. Cache Address Space - Cache-A and Cache-B
The Data Cache is logically divided into logical 256KB cache slots across the entire linear cache space (the concatenation of Cache-A and Cache-B). It is a single uniform address space rather than a set of isolated, partitioned spaces where only certain devices are mapped to individual regions on independent cache boards.

When created, each Universal Storage Platform V LDEV (a partition from a Parity Group) is allocated an initial set of 16 cache slots for Random I/O operations. A separate set of 24 cache slots is allocated when a FED MP detects sequential I/O operations occurring on that LDEV. The initial set of 16 cache slots (per LDEV) used for processing Random I/O requests can be temporarily expanded by one or more additional sets of 16 random I/O 256KB slots as needed (if available from the associated cache partition's free space).

Since individual LDEVs can dynamically grow to very large sizes in terms of their cache footprint based on the workloads, Cache Partitioning (using the Virtual Partition Management package) can be used to fence off the cache space usable by selected Parity Groups (with all of their LDEVs) to manage their maximum allocated random slot sets.
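A simplified model of this per-LDEV slot accounting (hypothetical class names; the real bookkeeping lives in Shared Memory tables):

    # Simplified, hypothetical model of per-LDEV cache slot sets.
    SLOT_KB, RANDOM_SET, SEQ_SET = 256, 16, 24

    class LdevCacheFootprint:
        def __init__(self):
            self.random_sets = 1          # every LDEV starts with one set of 16 slots
            self.sequential = False       # set while sequential detect is active

        def expand_random(self):          # one more set of 16, if partition space allows
            self.random_sets += 1

        def slots(self):
            return self.random_sets * RANDOM_SET + (SEQ_SET if self.sequential else 0)

    ldev = LdevCacheFootprint()
    ldev.sequential = True                # FED MP sees a sequential pattern
    ldev.expand_random()                  # heavy random load forces an extra set
    print(ldev.slots(), "slots =", ldev.slots() * SLOT_KB // 1024, "MB of cache footprint")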
Random I/O Cache Operations

Each logical 256KB cache slot contains four logical 64KB cache segments. For each cache slot, two of these 64KB segments are used only for read operations, and two are used only for write operations (more about this distinction below). Furthermore, each of the 64KB segments is further subdivided into four logical 16KB sub-segments. Each of these is then divided into eight physical 2KB Cache Blocks. Therefore, each 256KB cache slot contains 64 2KB cache blocks for random reads and another 64 2KB blocks for random writes. Any cache block in Cache-A and Cache-B can be used to build the logical 16KB sub-segments (managed by tables in Shared Memory).
Initial Random Cache Allocation Per LDEV

256KB Cache Slots          16
64KB Read Segments         32
64KB Write Segments        32
16KB Read Sub-segments     128
16KB Write Sub-segments    128
2KB Read Cache Blocks      1024
2KB Write Cache Blocks     1024
The table above shows the overall breakdown of elements for a set of 16 random I/O cache slots for one LDEV. Figures 8-10 below show the relationship among these elements.
Figure 8. Random I/O: Logical Cache Slot and Segments Layout
Figure 9. Random I/O: Logical Segment and Sub-segment Layout
Figure 10. Random I/O: Logical Sub-segments and Physical Cache Blocks Layout
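The counts in the table follow mechanically from the subdivision rules shown in Figures 8-10:

    # Deriving the per-LDEV random-allocation table from the subdivision rules.
    slots = 16                                   # initial 256KB random slots per LDEV
    read_segs = write_segs = slots * 2           # 2 read + 2 write 64KB segments per slot
    read_subs = write_subs = read_segs * 4       # four 16KB sub-segments per 64KB segment
    read_blocks = write_blocks = read_subs * 8   # eight 2KB blocks per 16KB sub-segment
    print(read_segs, read_subs, read_blocks)     # -> 32 128 1024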
It is because of these small 2KB cache block allocation units that the Universal Storage Platform V does so well with very small block sizes and random workloads. In general, random workloads are usually associated with application block sizes of 2KB to 32KB. Other vendors use a single fixed cache allocation size (such as 32KB and 64KB for EMC DMX) for individual I/O operations. Hence, a 2KB host I/O will waste 30KB or 62KB of each EMC DMX cache slot. On the IBM DS8000 series, the cache allocation unit is 4KB, thereby avoiding wasting cache space. But the DS8000 arrays are actually general-purpose servers running the AIX operating system using the local shared RAM on individual processor books. On both EMC and IBM, their global cache contains the operating system space, all storage control and metadata, all software, and all data blocks. On the Universal Storage Platform V, the CMA cache system is only used for user data blocks.
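The waste comparison is simple arithmetic:

    # Cache wasted when a 2KB host I/O occupies one vendor allocation unit.
    host_kb = 2
    for design, unit_kb in [("USP V", 2), ("IBM DS8000", 4),
                            ("EMC DMX, 32KB slot", 32), ("EMC DMX, 64KB slot", 64)]:
        print(f"{design}: {unit_kb - host_kb}KB wasted of each {unit_kb}KB unit")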
Write Operations

In the case of host writes, the individual data blocks are mirrored (duplexed) to another CMA board on a different power boundary (i.e., Cache-A and Cache-B). The mirroring is only for the actual user data blocks, unlike the static mirroring of 50% of cache onto the other 50% of cache as is used by EMC for DMX-3 and DMX-4.

For example, if a host application writes an 8KB block of data, that 8KB block will be written to two CMA boards, using four 2KB cache blocks from a 64KB Write segment in Cache-A and another set of four cache blocks from Cache-B. The FED MP that owns the port on which the host request arrived is the processor that performs this duplexing task. After the write has been processed and destaged to disk, the 8KB of mirrored blocks will be deleted, and the other 8KB will remain in cache (after being remapped to a Read sub-segment; see below) for a possible future read hit on that data.
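A sketch of that duplexing step (purely illustrative; actual block placement is driven by the cache map in Shared Memory):

    # Illustrative duplexing of one host write into Cache-A and Cache-B.
    BLOCK_KB = 2

    def duplex_write(io_kb):
        """Return the 2KB cache blocks consumed in each cache region."""
        n = io_kb // BLOCK_KB
        cache_a = [f"A-block-{i}" for i in range(n)]   # from a 64KB Write segment
        cache_b = [f"B-block-{i}" for i in range(n)]   # mirror on the other power boundary
        return cache_a, cache_b

    a, b = duplex_write(8)         # the 8KB example above
    print(len(a), len(b))          # -> 4 4: four blocks in Cache-A, four mirrored in Cache-B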
Note that a read cache hit on data not yet destaged to disk from a Write Segment is not possible. In fact, no host reads are directed against Write Segments. A read request for a block which still has a write-pending condition will force a disk destage first, and all pending writes in that same 64KB segment will be included in this forced destage. Similarly, a write cache hit is not possible at any time, if a "write hit" is meant as overwriting a pending write block. On the Universal Storage Platform V, all blocks written to cache must be destaged to disk in the order they arrived. There is no overwriting of "dirty" blocks not yet written to disk, as is the case with a midrange array. On the Universal Storage Platform V, a "write hit" actually means that there is empty space in one of that LDEV's existing write segments already available for a new I/O request. A "write miss" would then mean that no such existing space is available, and either another set of 16 random I/O cache slots will be allocated for use by that LDEV, or (if cache slots are tight) all of the pending writes in that LDEV's allocated 64KB Write Segments will be destaged, freeing up all of its random write space.
Write Pending Limit

There are write pending limit triggers of 30% and 70% per cache partition (the base partition, if others have not been created using the Virtual Partition Manager product). When the 30% limit is hit, the array begins adjusting the internal priority of write operations higher than that of reads. When the 70% limit is hit, the array goes into an urgent level of data destage (writes take precedence over reads) to disk from those 64KB random segments used for writes. If using VPM and cache partitions (and a global mode setting of 454=ON for the array), in most cases only the 64KB Write Segments within a partition see this urgent destage (the mode switch creates an averaging mechanism to prevent all partitions from being similarly affected). Other partitions are generally isolated with their own write pending triggers. In most cases, a 70% write pending limit is an indication of too few disks absorbing the write operations.
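Stated as decision logic (an illustrative reconstruction of the two triggers, not actual microcode):

    # Illustrative reconstruction of the per-partition write pending triggers.
    def destage_mode(write_pending_pct):
        if write_pending_pct >= 70:
            return "urgent destage: writes take precedence over reads"
        if write_pending_pct >= 30:
            return "elevated: write priority raised above reads"
        return "normal: no special destage priority"

    for pct in (10, 45, 75):
        print(f"{pct}% write pending -> {destage_mode(pct)}")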
Read versus Write Segments
There is an interesting event that happens after a Write Segment is processed by a destage operation. All cache slots, as well as their Read and Write 64KB segments, sub-segments, and 2KB cache blocks, are actually dispersed across the entire cache space, managed at the 2KB cache block level. These 2KB cache blocks are dynamically remapped to various cache segments. A Write segment cannot be used for reads. So how does a fresh write that has been destaged become readable as a cache hit for follow-on read requests?

Suppose 8KB of data was written to eight 2KB cache blocks (four in one Write Segment, with four more in a Write Segment mirror). After the disk destage operation occurs, four of those blocks are marked as free (from the mirrored Write Segment) and four are remapped into a Read Segment (thus becoming readable for a cache hit). These four surrendered blocks from that Write Segment's address space are replaced by four unused 2KB cache blocks. The user data was not copied, but the logical location of those blocks did change. All of this mapping is managed in the cache map in the Shared Memory system and occurs at RAM speeds. In fact, nearly everything in the Universal Storage Platform V (internal LDEVs, external LDEVs, Hitachi Dynamic Provisioning pages, etc.) is a mapped entity with a high degree of flexibility on how to manage it.
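The remap can be illustrated with a small sketch in which blocks are names in lists and destage only moves references, never data. All structures here are our own illustration:

```python
write_segment = ["blk0", "blk1", "blk2", "blk3"]   # blocks holding the 8KB write
mirror_segment = ["blk4", "blk5", "blk6", "blk7"]  # duplexed copies
read_segment = []                                  # destination after destage
free_pool = ["blk8", "blk9", "blk10", "blk11"]     # unused 2KB blocks

def destage(write_seg, mirror_seg, read_seg, pool):
    # Mirror copies are simply marked free.
    pool.extend(mirror_seg)
    mirror_seg.clear()
    # Primary copies are remapped into a Read segment (now hittable by reads),
    # and the Write segment's address space is refilled from the free pool.
    read_seg.extend(write_seg)
    write_seg[:] = [pool.pop() for _ in range(len(write_seg))]

destage(write_segment, mirror_segment, read_segment, free_pool)
print(read_segment)    # ['blk0', 'blk1', 'blk2', 'blk3'] -- readable cache hits
print(write_segment)   # four free blocks remapped in; the data never moved
```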
Sequential I/O Cache Operations and Sequential Detect
A separate set of 24 cache slots is allocated to an individual LDEV when a sequential I/O pattern is detected against that LDEV on a port owned by a FED MP (known as sequential detect). This set of 24 slots is released when a sequential I/O pattern is no longer detected. This is one reason a proper sequential detect state is important on the Universal Storage Platform V. If sequential I/O is broken up
across several ports on different FEDs (for example), it will look like random I/O to the several individual MPs controlling those host ports. As an example, the use of a host-based Logical Volume Manager that creates a large RAID-0 volume by striping across several LDEVs on several different host ports can defeat the sequential detection mechanism. Sequential detect is managed for each LDEV controlled by an MP on a FED for the one or two FED ports that it owns.
Unlike the cache slots assigned for random I/O, each 256KB cache slot is used as a single 256KB segment for sequential I/O. This matches the RAID chunk size of 256KB on the disks. In sequential mode, an entire 256KB chunk is read from or written to each disk. Sequential prefetch will read several such chunks into cache slots without being instructed to do so by the controlling MP on the FED involved. Sequential writes will be performed as full stripe writes to a Parity Group, thus minimizing the write penalty for RAID levels 5 and 6. The data in these sequential cache slots can be reusable for a subsequent cache hit. In general, once the data has been processed, these dynamically allocated cache slots are released for the next sequential I/O operation against that LDEV, or are returned to the cache pool once sequential detect is no longer present (for some period of time) for that LDEV. However, certain lab tests do show a high cache hit rate where a small number of LUNs under test have high sequential read ratios and use large block sizes (such as 1MB).
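A sketch of how a per-MP sequential detect might track per-LDEV runs follows. The run-length threshold and data structures are assumptions; the text only specifies that detection is per LDEV, per MP:

```python
SEQ_SLOTS = 24        # dedicated 256KB slots while the pattern persists
CHUNK = 256 * 1024    # sequential slots are used as a single 256KB segment

class SequentialDetect:
    def __init__(self, run_threshold=4):      # threshold value is an assumption
        self.last_lba = {}                    # per-LDEV next expected address
        self.run = {}                         # per-LDEV consecutive-run count
        self.threshold = run_threshold

    def observe(self, ldev, lba, blocks):
        """Return True while the LDEV is in sequential mode on this MP."""
        if self.last_lba.get(ldev) == lba:    # request continues the prior one
            self.run[ldev] = self.run.get(ldev, 0) + 1
        else:
            self.run[ldev] = 0                # a broken run looks random
        self.last_lba[ldev] = lba + blocks
        return self.run[ldev] >= self.threshold

det = SequentialDetect()
for i in range(8):
    seq = det.observe("LDEV:00:10", lba=i * 64, blocks=64)
print("sequential mode:", seq)   # True once the run threshold is reached
```

Because each MP keeps its own tracker, striping a volume across ports on different FEDs fragments the run at every MP and defeats detection, as described above.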
Figure 11. Sequential I/O Read Slot
Figure 12. Sequential I/O Write Slot
BED and FED Local RAM
There is a pool of RAM installed on each FED or BED PCB. This RAM is shared by the four MP processors per PCB. This is where the LUN management tables (such as MP Workload Sharing), sequential detect histograms, and limited local caching for highly reused data blocks are located. There is also NVRAM associated with each MP for holding the system microcode and optional software packages. Note that other designs locate both of these systems in a monolithic global cache system, and every access for these elements interferes with user data block movement. On the Universal Storage Platform V the degree of memory parallelism removes this burden from the data cache system.

When new firmware is to be flashed into the NVRAM owned by each MP, the new software is first copied into the local RAM (over the internal MP-to-SVP network), and then, one by one, each MP suspends its activities, copies the new microcode into its NVRAM, and then reboots. The next MP on that FED or BED board will then follow the same steps until all four MPs are updated.
Front-End Director Concepts
The Front-end Directors manage the access and scheduling of I/O requests to and from the attached servers. The FEDs contain the host ports that interface with the servers or (when using Open Fibre FEDs) attach to external storage. There are four types of FED features available: 8-port Open Fibre, 16-port Open Fibre, 8-port FICON, and 8-port ESCON. The various types of interface options can be supported simultaneously by mixing FED features within the Universal Storage Platform V.
Table 4. Comparison: Universal Storage Platform V FED Features

    FED Options    Features (USP V)    Total Ports (USP V)
    Open Fibre     0-14                0-224
    or ESCON       0-8                 0-64
    or FICON       0-8                 0-64
Refer to the Figure below for the following board components discussion.

- The FED processors (CHPs, more commonly called MPs) manage the host I/O requests and the Shared Memory and Cache areas, and execute the microcode and the optional software features such as Dynamic Provisioning, Universal Replicator, Volume Migrator, and TrueCopy.
- The DX4 chips are Fibre Channel encoder-decoder chips that manage the Fibre Channel protocol on the cables to the host ports.
- The Data Adapter (DTA) chip is a special ASIC for communication with the CSWs.
- The Microprocessor Adapter (MPA) is an ASIC for communication with the SMA PCBs.
- All I/O, whether Reads or Writes, to internal or external storage, must pass through the Cache and Shared Memory systems.
Figure 13. Example of a Universal Storage Platform V FED Board (16-port Open Fibre Feature)
FED Microcode Updates
The Universal Storage Platform V microcode is updated by flashing each MP's NVRAM in the FED PCBs. A copy of this microcode is saved in the local RAM on each FED PCB, and each MP on the PCB (four MPs) will perform a hot upgrade in a rolling fashion. While host ports are still live, each MP will take itself offline, flash the new microcode from the local FED RAM, and then reboot itself. It then returns to an online state and resumes processing I/O requests and executing the optional installed software. The typical time for this process is 30 seconds per MP. This happens independently on each FED PCB and in parallel.
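The rolling sequence can be sketched as follows. The board and MP names and the update interface are invented; what the sketch shows is boards proceeding in parallel while MPs within a board update serially:

```python
import concurrent.futures

def update_board(board_name, mps):
    for mp in mps:   # MPs on one PCB update one-by-one, ports staying live
        print(f"{board_name}/{mp}: offline -> flash NVRAM -> reboot -> online")
        # ~30 seconds per MP in the real array; instantaneous in this sketch

boards = {"FED-00": ["MP0", "MP1", "MP2", "MP3"],
          "FED-08": ["MP0", "MP1", "MP2", "MP3"]}

# Each PCB performs its own rolling update independently and in parallel.
with concurrent.futures.ThreadPoolExecutor() as pool:
    for name, mps in boards.items():
        pool.submit(update_board, name, mps)
```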
FED FC-8 port, ESCON, and FICON: Summary of Features
Table 5 shows the Universal Storage Platform V options for the 8-port Fibre Channel, 8-port FICON, and 8-port ESCON features. The front-end port counts, the FED (CHA) PCB names, and the associated CSWs are also indicated.
Table 5. 8-port FED Features (FC-8 port, FICON, ESCON)

    Pair         FED     Boards          Ports   CSW
    Feature 1    FED1    FED00  FED08    8       0
    Feature 2    FED2    FED02  FED0A    16      0
    Feature 3    FED3    FED01  FED09    24      1
    Feature 4    FED4    FED03  FED0B    32      1
    Feature 5    FED5    FED04  FED0C    40      2
    Feature 6    FED6    FED06  FED0E    48      2
    Feature 7    FED7    FED05  FED0D    56      3
    Feature 8    FED8    FED07  FED0F    64      3
FED FC-16 port Feature
Table 6 shows the Universal Storage Platform V options for the 16-port Fibre Channel features. The front-end port counts, the FED PCB names, and the associated CSWs are also indicated. This port count can be increased up to 224 (using up to 14 FED features) if the number of BED features is minimal (two), in order to allow for more FED features (by taking 12 BED slots). This FED expansion would be used for adding a large amount of external storage to the Universal Storage Platform V, where fewer than 256 internal disks would be configured.
Table 6. Universal Storage Platform V FED FC-16 Feature

    Pair          FED      Boards          Ports   CSW
    Feature 1     FED1     FED00  FED08    16      0
    Feature 2     FED2     FED02  FED0A    32      0
    Feature 3     FED3     FED01  FED09    48      1
    Feature 4     FED4     FED03  FED0B    64      1
    Feature 5     FED5     FED04  FED0C    80      2
    Feature 6     FED6     FED06  FED0E    96      2
    Feature 7     FED7     FED05  FED0D    112     3
    Feature 8     FED8     FED07  FED0F    128     3
    Feature 9     FED9     (BED3)          144     1
    Feature 10    FED10    (BED4)          160     1
    Feature 11    FED11    (BED5)          176     2
    Feature 12    FED12    (BED6)          192     2
    Feature 13    FED13    (BED7)          208     3
    Feature 14    FED14    (BED8)          224     3
I/O Request Limits and Queue Depths (Open Fibre)
Every fibre channel path between the host and the storage array has a specific maximum capacity, known as the Maximum I/O Request Limit. This is the limit on the aggregate number of requests being directed against the individual LUNs. For the Universal Storage Platform V, this limit is 4096, and is associated with each FED MP. On the 16-port feature, where there are two ports managed by each MP, 4096 is the aggregate limit across those two ports. [This is handled differently for FICON, where the MP's limit is 480 for Open Exchange messaging.] However, when a port (actually the associated MP) is placed in external storage mode (becomes an initiator), the port I/O Request Limit drops to 256.
Queue Depth is the maximum number of outstanding I/O requests per LUN (internal or external). This is separate from the MP's port I/O Request Limit. Note that there is no concept of Queue Depth for ESCON or FICON volumes.

On the Universal Storage Platform V, the per-LUN nominal queue depth is 32 but can be much higher (up to 4096) in the absence of other workloads on that MP's ports for internal LUNs. In the case of external (virtualized) LUNs, the queue depth can be set from 2 to 128 in the Universal Volume Manager GUI as shown below. Increasing the default queue depth value for external LUNs from 8 up to 32 can often have a very positive effect (especially on Response Time) on OLTP-like workloads. The actual limit will depend on how many other concurrently active external LUNs there are on that port.
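The interaction of the two limits can be sketched as a simple admission check. This is a hypothetical model; the field names and the admission logic are ours:

```python
class FedMp:
    """Per-MP request limit plus per-LUN queue depth, as described above."""
    def __init__(self, external=False, lun_queue_depth=32):
        self.request_limit = 256 if external else 4096   # aggregate, per MP
        self.lun_queue_depth = lun_queue_depth           # per LUN
        self.outstanding = {}                            # in-flight counts

    def admit(self, lun):
        """Return True if a new request to `lun` may be issued now."""
        total = sum(self.outstanding.values())
        per_lun = self.outstanding.get(lun, 0)
        if total >= self.request_limit or per_lun >= self.lun_queue_depth:
            return False                                 # must queue or retry
        self.outstanding[lun] = per_lun + 1
        return True

mp = FedMp(external=True, lun_queue_depth=32)   # virtualized-storage port
print(mp.admit("eLUN-0"))                       # True until a limit is hit
```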
Figure 14. Storage Navigator Screen Showing External Port Queue Depth Control
MP Distributed I/O (Open Fibre)
The Open Fibre FEDs on the Universal Storage Platform V have a new feature known as MP Distributed I/O (not available on FICON or ESCON FEDs). Some of the work pertaining to an individual I/O on a single port may be processed by another MP on the same PCB rather than the MP that normally controls that port. [See Appendix 8 for the port-MP mappings.] A heavily loaded MP can hand off some tasks pertaining to data, metadata, software (SI, TC, HUR), and Hitachi Dynamic Provisioning Pool administrative tasks to another MP that is less busy. MPs can be configured in one of four modes: target (host), HUR-A (send), HUR-B (receive), and initiator (external) mode. MPs must be in the same mode in order to participate in Distributed I/O. For example, two MPs (on the same board) whose ports are configured in initiator (external) mode will work together.

The Distributed I/O feature is limited to the four MPs on an individual PCB. Some of this technique actually began in the Universal Storage Platform with code levels V08 and higher, but most of it is unique to the Universal Storage Platform V with its much higher powered MPs and faster Shared Memory system. The degree to which Distributed I/O is available depends on the pattern of I/O requests from the host as well as the internal status of work already in the system, back-end processing of pending requests, and so forth.
Distributed I/O on an individual MP begins when it reaches a sustained average 50% busy rate, and is managed on a round-robin basis with the other available MPs on the same FED board, taking into account their current busy state. A table is kept in local memory to assist in this workload dispatching. An MP that receives an I/O command over a host path and is currently more than 50% busy will look for another MP on that board to assist with the operation. This offload does not represent load balancing among host ports, just a degree of parallel processing of I/O commands to the back end based on available cycles among the MPs on a board. [Note: When two or more MPs on a FED board are operated in external mode, we have observed that in some cases there is no minimum percent-busy threshold required to trigger the Distributed I/O mode.]
If an MP is doing processing for other MPs and then receives its own host I/O command, this workload could be distributed among the other MPs on the board. As the processors all get busy, the degree of sharing rapidly drops off. MPs that currently have no host I/O of their own can work up to a 100% busy rate on Distributed I/O loads. In the presence of host I/O on a port associated with an MP, the sum of that MP's host I/O + distributed I/O must be less than 50%. Once host I/O on an MP exceeds the 50% busy rate, there is no further acceptance of Distributed I/O by that MP.
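These rules condense into a small decision sketch. Everything below is our own illustration, not the actual microcode; in particular, we substitute a least-busy choice for the real round-robin table kept in local RAM:

```python
class Mp:
    def __init__(self, name, mode, busy, has_host_io):
        self.name, self.mode = name, mode
        self.busy, self.has_host_io = busy, has_host_io

    def can_assist(self):
        # Helpers with no host I/O may run to 100% busy on distributed work;
        # an MP serving its own host ports stops accepting help above 50%.
        return self.busy < (50 if self.has_host_io else 100)

def dispatch(requester, board_mps):
    """Pick the MP that will process a new I/O arriving at `requester`."""
    if requester.busy < 50:
        return requester   # below the threshold: keep the work local
    helpers = [mp for mp in board_mps
               if mp is not requester and mp.mode == requester.mode
               and mp.can_assist()]
    # Stand-in for the round-robin dispatch table: take the least-busy helper.
    return min(helpers, key=lambda mp: mp.busy, default=requester)

board = [Mp("MP0", "target", busy=80, has_host_io=True),
         Mp("MP1", "target", busy=20, has_host_io=False),
         Mp("MP2", "target", busy=45, has_host_io=True),
         Mp("MP3", "initiator", busy=5, has_host_io=False)]
print(dispatch(board[0], board).name)   # MP1: same mode, able to assist
```

Note how MP3 is never a candidate: it is in initiator mode, and MPs must share a mode to participate, as stated above.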
When Fibre Channel FED ports are configured for external storage, the associated owning MP no longer participates in the host-facing Distributed I/O mode. That MP, and the one or two ports it manages, is placed into initiator mode. (Note: every port on an MP is operated in the same mode.) If two or more MPs on a PCB are used for external attachment, then they will engage the Distributed I/O function. At least two MPs must be in external mode (and thus all of their ports) in order to get Distributed I/O for external loads on a FED board.
External Storage Mode I/O (Open Fibre)
When using the Open Fibre FEDs, each MP (and the one or two ports it owns) on the PCB may be configured in one of four modes: as a target (accepting host requests), as an initiator (driving storage requests), as TC/HUR-A (Copy products send), or as TC/HUR-B (Copy products receive). The initiator mode is how other storage arrays are attached to FED ports on a Universal Storage Platform V. Front-end ports from the external (secondary) array are attached to some number of Universal Storage Platform V FED ports as if the Universal Storage Platform V were a server. If using the 16-port feature, then both ports managed by an MP are placed into this mode. In essence, that FED MP (and the one or two ports it owns) is operated as though it were a BED MP. LUNs visible on the external array's ports are then remapped by the Universal Storage Platform V out to FED ports attached to hosts.
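A toy illustration of that mapping follows, using invented port and eLUN names; the real mapping tables live in Shared Memory:

```python
# Initiator-side view: (USP V initiator port, external array port, LUN) -> eLUN
external_paths = {
    ("CL1-E", "ext-array-port-0A", 0): "eLUN-00",
    ("CL1-E", "ext-array-port-0A", 1): "eLUN-01",
}
# Host-side view: (host-facing FED port, LUN presented to the host) -> eLUN
host_presentation = {
    ("CL2-A", 0): "eLUN-00",
    ("CL2-A", 1): "eLUN-01",
}

def route(host_port, lun):
    """Find which initiator path serves a host I/O to a virtualized LUN."""
    elun = host_presentation[(host_port, lun)]
    for path, mapped in external_paths.items():
        if mapped == elun:
            return path   # the FED MP owning this path drives the I/O
    raise KeyError(elun)

print(route("CL2-A", 1))  # ('CL1-E', 'ext-array-port-0A', 1)
```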
As I/O requests arrive over other host-attached FED ports on the Universal Storage Platform V for those LUNs, the normal I/O operations within the FED occur. The request is managed by Shared Memory tables and the data blocks go into the data cache. But the request is then routed to the FED MP (not a BED MP) that controls the external paths where that LUN is located. The external array processes the request as though it were talking to a server instead of the Universal Storage Platform V.
Cache Mode Settings
The external port (and all LUNs present on it) may be assigned either Cache Mode = ON or OFF when it is configured in the Universal Storage Platform V. The ON setting processes I/O and cache behavior identically to internal LDEVs. The OFF setting directs the Universal Storage Platform V to wait to report Write I/O completion until the data is accepted by the external device. The OFF setting does not change other cache handling behavior, such as reads to the external LUNs; however, this makes a significant difference when writing to slower external storage that presents a risk of high write pending conditions developing. The use of Cache Mode = ON will report I/O completion to the host for writes once the blocks are written to Universal Storage Platform V cache. Cache Mode = ON should normally not be used when high write rates are expected. In general, the rule of thumb is to use Cache Mode = OFF.
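A minimal sketch of the two acknowledgment behaviors, with invented stand-in functions for the cache write and the initiator-port I/O:

```python
destage_queue = []   # writes waiting to go to the external array (Mode ON)

def send_to_external_array(data):
    pass             # stand-in for the I/O over the initiator (external) port

def write_external(data, cache_mode_on):
    # In both modes the blocks land in USP V cache first.
    if cache_mode_on:
        destage_queue.append(data)   # destage later, asynchronously
        return "ack (write-back: host sees cache-speed completion)"
    send_to_external_array(data)     # wait until the external device accepts it
    return "ack (write-through: host sees the external array's latency)"

print(write_external(b"8KB block", cache_mode_on=False))
```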
Other External Mode Effects
Recall from above that the Queue Depth for an external LUN can be modified to be between 2 and 128 in the Universal Volume Manager GUI as shown above. Increasing the default value from 8 up to 32 (probably the best choice overall), 64, or perhaps even 128 can often have a very positive effect (especially on Response Time) on OLTP-like workloads. Along with this, recall that the maximum I/O Request Limit for an external port is 256 (not 4096).

When ports on a PCB are configured for use as external attachment, their owning MP is operated in initiator mode; there will be no Distributed I/O mode until at least one other MP is so configured.

The external port represents itself as a Windows server. Therefore, when configuring the host type on the ports of the virtualized storage array, they must be set to Windows mode.
Front-End Director Board Details
This section discusses the details of the four types of FED features available on the Universal Storage Platform V.
Open Fibre 8-Port Feature
The 8-port Open Fibre feature consists of two PCBs, each with four 4Gb/sec Open Fibre ports. Each PCB has four 800MHz Channel Host Interface Processors (CHP) and two Tachyon DX4 dual-ported chips. The Shared Memory (MPA) port count is now eight paths per PCB (double the Universal Storage Platform PCB). See Appendix 5 for ports, names, and MP associations.
Figure 15. Universal Storage Platform V 8-port Fibre Channel Feature (showing both PCBs)
Open Fibre 16-Port Feature
The 16-port Open Fibre feature consists of two PCBs, each with eight 4Gb/sec Open Fibre ports. Each PCB has four 800MHz Channel Host Interface Processors (CHP) and four Tachyon DX4 dual-ported chips. There are eight Shared Memory (MPA) paths per PCB (double the Universal Storage Platform PCB). See Appendix 6 for ports, names, and MP associations.
Figure 16. Universal Storage Platform V 16-port Fibre Channel Feature (showing both PCBs)
ESCON 8-port Feature
The 8-port ESCON feature consists of two PCBs, each with four 17MB/s ESCON ports. Each PCB has two 800MHz Channel Host Interface Processors (CHP) and one ESA0 interface. There are two Cache paths and eight Shared Memory (MPA) paths per PCB. See Appendix 7 for ports, names, and MP associations.
Figure 17. Universal Storage Platform V 8-port ESCON Feature (both PCBs shown)
FICON 8-port Feature
The 8-port FICON feature consists of two PCBs, each with four 4Gb/sec FICON ports. Each PCB has four 800MHz Channel Host Interface Processors (CHP) and two HTP interface chips. There are two Cache Switch paths and eight Shared Memory (MPA) paths per PCB. See Appendix 8 for ports, names, and MP associations.
Figure 18. Universal Storage Platform V 8-port FICON Feature (both PCBs shown)
[Figure content: the PCB diagrams show the CHP processors, the ESA0 interface, the DTA and MPA adapters, 2 x 1064 MB/sec data-only paths, and 8 x 150 MB/sec metadata-only paths.]
Back-end Director Concepts
The Back-end Directors manage access and scheduling of requests to and from the physical disks. The BEDs also monitor utilization of the loops, Parity Groups, and processors, and the status of the PCBs in a pair. The Universal Storage Platform V BED feature (2 PCBs) has four pairs of 4Gb/sec loops supporting up to 128 disks. Each PCB has four 800MHz DKP processors and four DRR RAID processors.
The BED features control the Fibre Channel loops that interface with the internal disks (but not with virtualized external storage). Every I/O operation to the disks will pass through the Cache and Shared Memory subsystems. Table 7 lists the BED features, loop counts, and disk capacities for the Universal Storage Platform V system. Note that BEDs must be installed as pairs of Options (four PCBs) in order to provide 8 pairs of loops to support the 8-disk Parity Groups, these being RAID-5 (7D+1P) and RAID-6 (6D+2P).

Note that, while the first pair of BED features can actually support 384 disks (the 128 disks in the Control Frame are attached to BED1 and BED2; see Figure 21 below), HDS has limited the field configuration to 256 disks until the second pair of BED features has been installed.
Table 7. Universal Storage Platform V BED Features

    BED Feature Count    Back-end Loops    Max Disks
    (2 PCBs)             Available         Configurable
    1 BED / 2 BED        8                 256 (PM limit)
    3 BED / 4 BED        16                640
    5 BED / 6 BED        24                896
    7 BED / 8 BED        32                1152
BED Microcode Updates
The Universal Storage Platform V microcode is updated by flashing the MPs in the BED PCBs. A copy of this microcode is saved in the local RAM on each BED PCB, and each MP on the PCB (four MPs) will perform a hot upgrade in a rolling fashion. While the disk loops are still live, each MP will take itself offline, flash the new microcode from the local BED RAM, and then reboot itself. It then returns to an online state and resumes processing I/O requests to the disks on its loops. The typical time for this process is 30 seconds per MP. This happens independently on each BED PCB and in parallel.
BED Feature Summary
Table 8 shows the Universal Storage Platform V options for the BED feature. The total back-end loop counts, the names of the BED PCBs, and the associated CSWs are also indicated.
Table 8. BED Features

    Pair         BED     Boards          Loop Pairs   CSW Used
    Basic        BED1    BED10  BED18                 0
                 BED2    BED12  BED1A    8            0
    Feature 2    BED3    BED11  BED19                 1
                 BED4    BED13  BED1B    16           1
    Feature 3    BED5    BED14  BED1C                 2
                 BED6    BED16  BED1E    24           2
    Feature 4    BED7    BED15  BED1D                 3
                 BED8    BED17  BED1F    32           3
Back-End-Director Board Details

BED Details
Figure 19 is a high-level diagram of the pair of PCBs for the Universal Storage Platform V BED feature.
Figure 19. Universal Storage Platform V 8-port BED Feature (both PCBs shown)
Back-End RAID Level Organization
Figure 20 is a view of the Universal Storage Platform V's BED-to-disk organization. This is also how the Lightning 9900V was organized. These two BED features (4 PCBs) can support 256 disks, with the exception being BED1 and BED2, which also control the 128 disks in the DKC control frame (hence 384 disks overall). It takes two BED options to support all of the Parity Group types due to the smaller 4-loop PCBs. While the second loop from the other controller would take over the load from a failed partner loop, the subsequent failure of that alternate loop would take down all of the RAID-10 (4D+4D) or RAID-5 (7D+1P) Parity Groups. Because of this, pairs of BEDs (4 PCBs) must always be installed.
Figure 20. Universal Storage Platform V Back-end Disk Layout. (2 BED features)
Universal Storage Platform V: HDU and BED Associations by Frame
Figures 21 and 22 illustrate the layouts of the 64-disk containers (HDUs), the Frames, and the associated BED ownerships of the Universal Storage Platform V. The names of the ranges of Array Groups are also shown. There are two views presented: a regular frontal view and a view of the back as seen from the back.
For example, looking at Figure 21, the bottom of Frame DKU-R1 shows the front of the HDU whose disks are controlled by the eight 4Gbit loops on BED1 (pg3, yellow) and BED2 (pg4, orange). The HDU is split in half by power domains, where the 32 disks on the left half (16 on the front and 16 on the back of the Frame) are attached to BED2 and the 32 on the right half go to BED1. Note the diagonal offsets of the HDU halves from front to rear of the two associated 16-disk groups by HDU power domains.
Figure 21. Front View of the Universal Storage Platform V Frames, HDUs, with BED Ownership
[Figure content: a grid of 16-HDD groups arranged by frame and power domain, each labeled with its owning BED (bed1 through bed8) and Parity Group range (pg1 through pg18).]