4K Sectors and the Impact on Storage Systems
Bret S. Weber, Fellow, Engenio Storage Group, LSI Corporation
Agenda
• Host Impact
• Cache
• Mixing Sector Sizes
• Protection Information Model (DIF)
• Object Storage
Storage System Landscape
[Figure: storage system landscape covering external and internal storage. Labels: RAID, Drivers & Utilities; Adapters; Chip Down; Enclosure; RAID Controller; Drive CRU]
Host Impacts
External RAID Controller Device Mappings
[Figure: hosts exchange I/O with the RAID controller in the host SCSI domain (Device X, LUN 0, 1, 2); the controller exchanges I/O with the drives in the drive SCSI domain (Device 0 through Device N, each LUN 0); a data cache sits between the two domains]
Array Cache Isolates the Device Sector Size
External RAID Mappings
[Figure: Drive Groups 1, 2, and 3, built from drives with 512-byte, 4K-byte, and 520-byte sectors; each drive group is presented to the host as a flexible-LB device]
External RAID Impact
• Sector Size does not have to be linked to Drive
  • Buffering Decouples the Drive and Host
• LB Size could be held constant to Server
  • Could be made 512 or 4K on an Array Basis
  • This adds significant complexity to the Array
• Each LUN on a Device may have a different LB Size
  • Potential to tie the block size to LUN
  • This is the Preferred Method
  • LB would be tied to the Native sector size
  • Restrictions on Volume Group Mappings
Methodology Depends on Host Capabilities
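The decoupling of host LB size from drive sector size can be sketched as a byte-range mapping. This is a minimal illustration, not the controller's actual algorithm: the names `DriveExtent` and `map_host_io` are hypothetical, and RAID striping, parity, and 520-byte formats are ignored. It shows which drive sectors a host I/O touches and whether the edges are partial sectors, which is where the array cache has to absorb the mismatch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DriveExtent:
    """Drive sectors touched by one host I/O (hypothetical helper type)."""
    first_sector: int   # first drive sector touched
    sector_count: int   # number of drive sectors touched
    head_partial: bool  # True if the I/O starts in the middle of a sector
    tail_partial: bool  # True if the I/O ends in the middle of a sector


def map_host_io(host_lba: int, block_count: int,
                host_lb_size: int, drive_sector_size: int) -> DriveExtent:
    """Map a host I/O, given in host logical blocks, onto drive sectors."""
    if block_count < 1:
        raise ValueError("block_count must be at least 1")
    start_byte = host_lba * host_lb_size
    end_byte = start_byte + block_count * host_lb_size  # exclusive
    first_sector = start_byte // drive_sector_size
    last_sector = (end_byte - 1) // drive_sector_size
    return DriveExtent(
        first_sector=first_sector,
        sector_count=last_sector - first_sector + 1,
        head_partial=start_byte % drive_sector_size != 0,
        tail_partial=end_byte % drive_sector_size != 0,
    )
```

A single 512-byte host block on a 4K drive, `map_host_io(9, 1, 512, 4096)`, lands inside drive sector 1 with both edges partial. An aligned run of eight blocks, `map_host_io(8, 8, 512, 4096)`, covers the same sector exactly.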
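Internal RAID Controller Device Mappings
[Figure: internal RAID controller with drives Device 0 through Device N (each LUN 0) presented to the host as Device X, Device Y, Device Z; a cut-through path links the controller to the host's processor, chipset, system memory, and device driver]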
PCI RAID Mappings
[Figure: Drive Groups 1, 2, and 3, built from drives with 512-byte, 4K-byte, and 520-byte sectors, presented over the cut-through path as a 512-byte LB device, a 4K-byte LB device, and a 520-byte LB device]
Internal RAID Impact
• Host LB Size needs to be linked to Sector Size
  • This is due to the direct coupled data transfers
• Each Device will have a different LB Size
  • Host LB Size will be tied to Sector Size
  • Restrictions on Volume Group Mappings
• RAID Drive groups must contain drives with all of the same sector size
RAID Controllers Prefer to Tie Host LB size to Sector Size
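The drive-group restriction can be expressed as a small check. This is only a sketch under stated assumptions: `host_lb_size_for_group` is a hypothetical name, and a real controller would read sector sizes from the drives rather than take them as a list.

```python
def host_lb_size_for_group(drive_sector_sizes):
    """Return the host LB size for a drive group, tied to the sector size.

    Raises ValueError if the group is empty or mixes sector sizes, since
    the direct-coupled (cut-through) transfers leave no place to translate.
    """
    sizes = set(drive_sector_sizes)
    if not sizes:
        raise ValueError("drive group is empty")
    if len(sizes) != 1:
        raise ValueError(f"mixed sector sizes in drive group: {sorted(sizes)}")
    return sizes.pop()
```

For example, a group of three 4K drives yields a 4096-byte host LB size, while a group mixing 512-byte and 4K drives is rejected.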
Protection Information Model (T10 DIF)
DIF Today – Host 512, Drive 512
[Figure: host-to-array and array-to-drive data streams; in both, every 512-byte block is followed by 8 bytes of protection information, with reference tags n+0 through n+7]
Host LBA n (Type 1 – LBA Least Significant 4 Bytes and Auto Increment)
Drive LBA a (Type 2 – 32 Byte Command with n+0 specified Initial Tag and Auto Increment)
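As a rough illustration of the 512/512 case, the sketch below appends an 8-byte protection information field to each 512-byte block, using the standard field layout (2-byte guard CRC, 2-byte application tag, 4-byte reference tag, big-endian) and the Type 1 rule that the reference tag is the low 4 bytes of the LBA. The function names and the zero application tag are assumptions made for the example.

```python
import struct

T10_DIF_POLY = 0x8BB7  # generator polynomial of the 16-bit guard CRC


def crc16_t10dif(data: bytes) -> int:
    """Bitwise CRC-16 with polynomial 0x8BB7, initial value 0, no reflection."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ T10_DIF_POLY) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def protect_type1(data: bytes, first_lba: int,
                  block_size: int = 512, app_tag: int = 0) -> bytes:
    """Interleave each block with guard, application tag, and reference tag."""
    if len(data) % block_size:
        raise ValueError("data must be a whole number of blocks")
    out = bytearray()
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        # Type 1: reference tag is the low 32 bits of the LBA, auto-incremented.
        ref_tag = (first_lba + offset // block_size) & 0xFFFFFFFF
        out += block
        out += struct.pack(">HHI", crc16_t10dif(block), app_tag, ref_tag)
    return bytes(out)
```

Two blocks starting at LBA n produce a 1040-byte stream whose reference tags are n and n+1, matching the n+0, n+1, ... sequence in the figure.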
Host – 4K, Drive 4K
[Figure: host-to-array and array-to-drive data streams; in both, every 4K-byte block is followed by 8 bytes of protection information, with reference tags n+0, n+1, n+2]
Host LBA n (Type 1 – LBA Least Significant 4 Bytes and Auto Increment)
Drive LBA a (Type 2 – 32 Byte Command with n+0 specified Initial Tag and Auto Increment)
Issues:
• Should work similarly to the 512-byte case, with no issues
DIF Issues
• Array Controllers typically support 512 byte Logical Block Sizes
  • As reported in a SCSI Read Capacity Command
• Array Controllers will probably have to support 4K eventually
• Will there be a time when the Array Controller will need to support both 512 byte and 4K native Logical Block Sizes?
• If so, then this brings up the need for DIF to support the following combinations
  • Host 512 bytes – Drive 512 bytes
  • Host 512 bytes – Drive 4K bytes
  • Host 4K bytes – Drive 512 bytes
  • Host 4K bytes – Drive 4K bytes
• Currently DIF is only defined for 8 bytes appended to each SCSI logical block
DIF Issues
• Mixing sizes currently brings up many issues
  • Repeated Ref Tags
  • Multi Block vs Single Block transfers (n+1 or n+8)
  • Is Host LBA = LBA or LBA*8
  • Existing Silicon Compatibility
Host – 512, Drive 4K
[Figure: the host-to-array data stream carries 512-byte blocks, each followed by 8 bytes of protection information (Host LBA n, reference tags n+0 through n+7); the array-to-drive data stream carries 4K-byte blocks, each followed by 8 bytes (Drive LBA a)]
Issues:
• 64 bytes need to map to 8
• Controller needs to do read-modify-write (RMW) on small block accesses
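The RMW issue can be sketched with a toy in-memory drive. Everything here is hypothetical (`Drive4K`, `write_host_block`), and the protection information is left out: the point is only that a single 512-byte host write costs one 4K read plus one 4K write on the drive side.

```python
SECTOR = 4096             # drive sector size in bytes
HOST_LB = 512             # host logical block size in bytes
RATIO = SECTOR // HOST_LB  # host blocks per drive sector (8)


class Drive4K:
    """Toy 4K-sector drive that counts its reads and writes."""

    def __init__(self, sector_count: int):
        self.sectors = [bytes(SECTOR) for _ in range(sector_count)]
        self.reads = 0
        self.writes = 0

    def read(self, lba: int) -> bytes:
        self.reads += 1
        return self.sectors[lba]

    def write(self, lba: int, data: bytes) -> None:
        if len(data) != SECTOR:
            raise ValueError("drive accepts whole 4K sectors only")
        self.writes += 1
        self.sectors[lba] = bytes(data)


def write_host_block(drive: Drive4K, host_lba: int, data: bytes) -> None:
    """Write one 512-byte host block using read-modify-write."""
    if len(data) != HOST_LB:
        raise ValueError("host block must be 512 bytes")
    drive_lba, slot = divmod(host_lba, RATIO)
    sector = bytearray(drive.read(drive_lba))               # read
    sector[slot * HOST_LB:(slot + 1) * HOST_LB] = data      # modify
    drive.write(drive_lba, sector)                          # write
```

Writing host LBA 9 updates bytes 512 to 1023 of drive sector 1 and leaves the other seven host blocks in that sector untouched. A write-back cache can avoid the read when all eight host blocks of a sector are already buffered.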
Host 4K – Drive 512
[Figure: the host-to-array data stream carries 4K-byte blocks, each followed by 8 bytes of protection information (Host LBA n, reference tags n+0, n+1, n+2); the array-to-drive data stream carries 512-byte blocks, each followed by 8 bytes (Drive LBA a)]
Issues:
• 8 bytes need to map to 64
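In the opposite direction, each 4K host block fans out into eight 512-byte drive sectors, so one host protection field has to become eight drive fields. The sketch below shows only the address and data split, assuming the simple convention drive LBA = host LBA * 8; `split_host_block` is a hypothetical name, and how the reference tags should be derived is exactly the open question on the slide.

```python
def split_host_block(host_lba: int, data: bytes,
                     host_lb: int = 4096, drive_sector: int = 512):
    """Split one host block into (drive_lba, payload) pairs.

    Assumes drive LBA = host LBA * ratio, which is one possible convention.
    """
    if len(data) != host_lb:
        raise ValueError("data must be exactly one host logical block")
    if host_lb % drive_sector:
        raise ValueError("host LB size must be a multiple of the sector size")
    ratio = host_lb // drive_sector
    return [
        (host_lba * ratio + i, data[i * drive_sector:(i + 1) * drive_sector])
        for i in range(ratio)
    ]
```

Host LBA 3 maps to drive LBAs 24 through 31, and each of those sectors needs its own 8-byte field on the drive side.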
Potential T10 DIF Modification
• Modify T10 DIF so that the 8 bytes are always appended every 512 bytes, regardless of SCSI Logical Block size
• Investigate impact on current T10 DIF standard
• Investigate impact on current 512 byte implementations
• Investigate impact on current silicon
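The arithmetic behind the proposed change is simple enough to sketch. Assuming one 8-byte field after every 512 bytes of data, the helpers below (hypothetical names) give the protected size of a logical block and the byte offsets of its protection fields, for comparison with the current definition of one field per logical block.

```python
PI_INTERVAL = 512  # bytes of data covered by each protection field
PI_SIZE = 8        # bytes per protection field


def protected_block_size(lb_size: int) -> int:
    """Size of one logical block with a field appended every 512 bytes."""
    if lb_size % PI_INTERVAL:
        raise ValueError("logical block size must be a multiple of 512")
    return lb_size + PI_SIZE * (lb_size // PI_INTERVAL)


def pi_offsets(lb_size: int) -> list:
    """Byte offsets of the protection fields within one protected block."""
    if lb_size % PI_INTERVAL:
        raise ValueError("logical block size must be a multiple of 512")
    return [(i + 1) * PI_INTERVAL + i * PI_SIZE
            for i in range(lb_size // PI_INTERVAL)]
```

Under this scheme a 512-byte block is still 520 bytes on the wire, so existing 512-byte implementations see the same layout, while a 4K block grows to 4160 bytes rather than the 4104 bytes of the current definition.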
4K DIF Status
• Need to be Proactive with potential DIF issues
• Working on a Proposal to take to T10
• Currently looking at using DIF on 512 Byte boundary independent of block or sector size
• Compatibility with current DIF when using 512 byte sectors
Necessary if Host Requires LB Translations
Object Storage Impacts
Object Storage Impact
[Figure: traditional stack (applications, system call interface, file system user component, file system storage component, and block I/O manager on the CPU, with a sector/LBA interface to the storage device) compared with the OSD stack (applications, system call interface, and file system user component on the CPU, with an OSD interface to storage devices that each contain the file system storage component and block I/O manager)]
Object Storage Impact
[Figure: sector/LBA storage stack compared with the OSD storage stack, repeated from the previous slide]
• If you are doing an object controller with 4K sector drives you will be mapping objects to a file system.
• The file system will handle the 4K sectors on the disk drive
Object Storage Impact
[Figure: sector/LBA storage stack compared with the OSD storage stack, repeated from the previous slide]
• If you are doing objects at the drive level, then the drive will take care of it
Object Storage Impact
[Figure: sector/LBA storage stack compared with the OSD storage stack, repeated from the previous slide]
• If you are moving to objects in your host, then you have a new object driver in your host that will handle it so you will never see it
Conclusions
• Host Systems will have to deal with 4K sectors for the DAS and PCI RAID Case
• Host Systems could be buffered from the impact with External RAID, but they will be making the changes anyway
  • What is the timeframe of the overlap window?
• Array Controllers will require changes to use 4K Sector Disk Drives
• The Preferred method is to tie Host LB size to Sector Size
• Schemes like DIF may need to change to accommodate 4K sectors
Thank You
Questions ???