SCSI Mid-layer Eric Youngdale 2nd Annual Linux Storage Management Workshop October 2000

Page 1: SCSI Mid-layer Eric Youngdale 2nd Annual Linux Storage Management Workshop October 2000

SCSI Mid-layer

Eric Youngdale

2nd Annual Linux Storage Management Workshop

October 2000

Page 2

Introduction

Main points of this talk:

– Historical evolution of Linux SCSI.

– Explain state of the art in Linux 2.2.

– Discuss changes for 2.4.

– Discuss pending changes in the 2.5 kernel.

Page 3

Block devices and Linux

• Linux has a generic block device layer with which all filesystems will interact.

• SCSI is no different in this regard – it registers itself with the block device layer so it can receive requests.

• SCSI also handles character device requests and ioctls that do not originate in the block device layer.

Page 4

What is the “Mid-Layer”?

• Linux SCSI support can be viewed as 3 levels.

• Upper level is device management, such as tape, cdrom, disk, etc.

• Lower level talks to host adapters.

• Middle layer is essentially a traffic cop, handling requests from the rest of the kernel and dispatching them to the rest of SCSI.

Page 5

State of the art in Linux-2.2

• Error handling is better for drivers that use the new error handling code introduced in 2.2.

• Queue management fundamentally unchanged since the Linux 1.x days. “The Code that Time Forgot”. Lots of dinosaurs running around in the code.

• Rest of mid-level largely stagnant.

Page 6

What was wrong in 2.2?

• The elevator algorithms in 2.2 allowed requests to grow regardless of the capabilities of the underlying device.

• All SCSI disks were handled in a single queue.

• Disk driver had to split requests that had become too large.

• One set of common logic for verifying requests had not become too large.

Page 7

What was wrong in 2.2 (cont)

• Character device requests not in queue.

• SMP safety was clumsily handled, leading to race conditions and poor performance.

• Poor scalability.

• Many drivers continue to use old error handling code.

Page 8

Queue handling in 2.2

[Figure: a single disk queue. The Disk Queue Head points to one list holding requests for Disk1, Disk2, Disk1, Disk3, Disk1.]

Page 9

Changes for Linux-2.4

• Block device layer was generalized to support a “request_queue_t” abstract datatype that represents a queue.

• Contains function pointers that drivers can use for managing the size of requests inserted into queues.

• Requests no longer can grow to be too large to be handled at one time.
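The idea of a queue that carries driver-supplied function pointers can be sketched as follows. This is only an illustration under stated assumptions: every name here (`sketch_queue`, `sketch_request`, `sketch_merge_fn`, `max_sectors`) is hypothetical and much simpler than the kernel's real `request_queue_t`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; NOT the kernel's real definitions. */
struct sketch_request {
    unsigned int nr_sectors;        /* current size of the request */
    struct sketch_request *next;    /* next request in the queue */
};

struct sketch_queue;

/* Callback type: may req absorb extra_sectors more sectors? 1 = yes, 0 = no. */
typedef int (sketch_merge_fn)(struct sketch_queue *q,
                              struct sketch_request *req,
                              unsigned int extra_sectors);

struct sketch_queue {
    struct sketch_request *head;
    unsigned int max_sectors;       /* assumed per-device limit */
    sketch_merge_fn *merge_fn;      /* installed by the driver */
};

/* Driver-side policy: refuse any merge that would exceed the device limit. */
static int sketch_bounded_merge(struct sketch_queue *q,
                                struct sketch_request *req,
                                unsigned int extra_sectors)
{
    return req->nr_sectors + extra_sectors <= q->max_sectors;
}

/* Generic layer: ask the driver's callback before growing a request. */
static int sketch_try_merge(struct sketch_queue *q,
                            struct sketch_request *req,
                            unsigned int extra_sectors)
{
    assert(q != NULL && req != NULL && q->merge_fn != NULL);
    if (!q->merge_fn(q, req, extra_sectors))
        return 0;                   /* caller must start a new request */
    req->nr_sectors += extra_sectors;
    return 1;
}
```

Because the check runs at insertion time, a request in this sketch never exceeds `max_sectors`, so nothing downstream has to split it.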

Page 10

Changes for 2.4 (cont)

• No longer any need for splitting requests.

• No need for ugly logic to scan a queue for a queueable request.

• SMP locking in mid-layer cleaned up to provide finer granularity.

Page 11

Changes for 2.4 (cont)

• A SCSI queuing library was created – a set of functions for queue management that are tailored to different sets of requirements.

• SCSI was modified to use a single queue for each physical device.

• Character device requests and ioctls are inserted into the same queue at the tail, and handled the same as other requests.
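A minimal sketch of one queue per physical device with tail insertion might look like this. The types and function names are hypothetical, and the sketch models only the tail insertion described above; it does not model how the block layer orders ordinary block requests.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types for illustration only. */
enum sketch_origin { SKETCH_BLOCK, SKETCH_CHAR, SKETCH_IOCTL };

struct sketch_cmd {
    enum sketch_origin origin;      /* where the request came from */
    struct sketch_cmd *next;
};

/* One queue per physical device, rather than one queue for all disks. */
struct sketch_device {
    struct sketch_cmd *head;
    struct sketch_cmd *tail;
    unsigned int depth;             /* number of queued commands */
};

/* Append a command at the tail of this device's queue. */
static void sketch_queue_tail(struct sketch_device *dev, struct sketch_cmd *cmd)
{
    assert(dev != NULL && cmd != NULL);
    cmd->next = NULL;
    if (dev->tail != NULL)
        dev->tail->next = cmd;
    else
        dev->head = cmd;            /* queue was empty */
    dev->tail = cmd;
    dev->depth++;
}

/* Remove and return the command at the head, or NULL if the queue is empty. */
static struct sketch_cmd *sketch_dequeue(struct sketch_device *dev)
{
    struct sketch_cmd *cmd = dev->head;

    if (cmd == NULL)
        return NULL;
    dev->head = cmd->next;
    if (dev->head == NULL)
        dev->tail = NULL;
    cmd->next = NULL;
    dev->depth--;
    return cmd;
}
```

In this sketch an ioctl queued after a block request is dispatched after it, through the same path, which is the property the slide describes.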

Page 12

Queuing library

Maintainability is a problem if multiple copies of code perform a similar function.

__inline static int __scsi_merge_requests_fn(request_queue_t * q,
                                             struct request * req,
                                             struct request * next,
                                             int use_clustering,
                                             int dma_host)
{
    /*
     * Appropriate contents
     */
}

Page 13

Queuing library (cont)

#define MERGEREQFCT(_FUNCTION, _CLUSTER, _DMA)                       \
static int _FUNCTION(request_queue_t * q,                            \
                     struct request * req,                           \
                     struct request * next)                          \
{                                                                    \
    return __scsi_merge_requests_fn(q, req, next, _CLUSTER, _DMA);   \
}

MERGEREQFCT(scsi_merge_requests_fn_, 0, 0)
MERGEREQFCT(scsi_merge_requests_fn_d, 0, 1)
MERGEREQFCT(scsi_merge_requests_fn_c, 1, 0)
MERGEREQFCT(scsi_merge_requests_fn_dc, 1, 1)
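To see the pattern work outside the kernel, here is a stand-alone sketch of the same technique. All `demo_` names are hypothetical, and the body of the inner function is a placeholder that merely reports which flags were baked in; it is not the real merge logic.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types. */
typedef struct { int unused; } demo_queue_t;
struct demo_request { int unused; };

/*
 * One shared inline body. Placeholder logic: encode the two flags in the
 * return value so each generated wrapper can be told apart.
 */
static inline int demo_merge_inner(demo_queue_t *q,
                                   struct demo_request *req,
                                   struct demo_request *next,
                                   int use_clustering,
                                   int dma_host)
{
    (void)q;
    (void)req;
    (void)next;
    return (use_clustering << 1) | dma_host;
}

/* Generate a thin wrapper that passes compile-time constant flags. */
#define DEMO_MERGEREQFCT(_FUNCTION, _CLUSTER, _DMA)                  \
static int _FUNCTION(demo_queue_t *q,                                \
                     struct demo_request *req,                       \
                     struct demo_request *next)                      \
{                                                                    \
    return demo_merge_inner(q, req, next, _CLUSTER, _DMA);           \
}

DEMO_MERGEREQFCT(demo_merge_, 0, 0)
DEMO_MERGEREQFCT(demo_merge_d, 0, 1)
DEMO_MERGEREQFCT(demo_merge_c, 1, 0)
DEMO_MERGEREQFCT(demo_merge_dc, 1, 1)
```

Since the flags are constants at each expansion site, a compiler that inlines the body can drop the branches that do not apply, so there is one source copy to maintain but four specialized functions to install as callbacks.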

Page 14

Changes for 2.4 (cont)

• In 2.2, there were separate functions and code paths for initializing SCSI, depending on whether it was compiled into the kernel or loaded as a module.

• In 2.4, this was cleaned up – redundant code was removed, and the same code initializes SCSI whether it is loaded as a module or compiled into the kernel.

Page 15

Upcoming changes for 2.5

• All drivers will be forced to use new error handling code.

• Disk driver will be updated to handle larger number of disks.

• SMP locking will be cleaned up some more to improve scalability.

Page 16

Old error handling code

• Essentially a bad state machine.

• Has tons of SMP problems that are not easily fixed.

• Tries to resolve errors while allowing new requests to be queued.

• Many kernel reliability problems are because of old error handling problems.

• Needs to be discarded in the worst way.

Page 17

New error handling code

• The new error handling code has been available since the 2.1.75 kernel.

• To force driver authors to update their drivers, the old error handling code will simply be removed. Drivers that have not been updated will fail to compile.

• Orphaned drivers will be handled on a case-by-case basis.

Page 18

Further SMP cleanups

• All low-level drivers currently use io_request_lock for SMP safety.

• This lock is also used by all other block devices on the system to protect their queues.

• Plans are in the works to switch the block device layer to use a per-queue lock, thereby isolating SCSI from other devices.

Page 19

SMP Cleanups (cont).

• Low-level drivers don’t need to protect queue – they don’t have access to it.

• Each low-level driver should have a separate lock – ideally one per instance of host, but could be a driver-wide lock initially. This should be up to the low-level driver.
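The difference between a single shared lock and one lock per host instance can be sketched in user space with POSIX mutexes standing in for kernel spinlocks. The structure and function names are hypothetical; this is not how any particular driver does it.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical per-host state; the mutex stands in for a kernel spinlock. */
struct sketch_host {
    pthread_mutex_t lock;           /* protects only this host's fields */
    unsigned int active_cmds;
};

static void sketch_host_init(struct sketch_host *h)
{
    assert(h != NULL);
    pthread_mutex_init(&h->lock, NULL);
    h->active_cmds = 0;
}

/* Account for a newly issued command; returns the new active count. */
static unsigned int sketch_host_start_cmd(struct sketch_host *h)
{
    unsigned int n;

    pthread_mutex_lock(&h->lock);
    n = ++h->active_cmds;
    pthread_mutex_unlock(&h->lock);
    return n;
}

/* Account for a completed command; returns the new active count. */
static unsigned int sketch_host_finish_cmd(struct sketch_host *h)
{
    unsigned int n;

    pthread_mutex_lock(&h->lock);
    assert(h->active_cmds > 0);
    n = --h->active_cmds;
    pthread_mutex_unlock(&h->lock);
    return n;
}
```

With this layout, work on one host never waits for a lock held on behalf of another host. A driver-wide lock would be the same code with a single `pthread_mutex_t` shared by every `sketch_host`.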

Page 20

SMP Cleanups (cont)

• Block device layer has a number of arrays, indexed by major/minor:

blksize_size[MAJOR(dev)][MINOR(dev)]

• Access is not protected by any locks.

• Impossible for block drivers to resize without introducing race condition.
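The lookup pattern and the race it invites can be sketched as below. The names are hypothetical stand-ins, and the sketch assumes an 8-bit major / 8-bit minor device number and a 1024-byte fallback when a major has registered no table.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed device-number layout: 8-bit major, 8-bit minor. */
#define SKETCH_MINORBITS 8
#define SKETCH_MAX_MAJOR 256
#define SKETCH_MAJOR(dev) ((unsigned int)(dev) >> SKETCH_MINORBITS)
#define SKETCH_MINOR(dev) ((unsigned int)(dev) & ((1u << SKETCH_MINORBITS) - 1))
#define SKETCH_DEFAULT_BLKSIZE 1024 /* assumed fallback block size */

/* One pointer per major; each driver supplies its own per-minor array. */
static int *sketch_blksize_size[SKETCH_MAX_MAJOR];

/* Look up the block size for a device number, with no locking at all. */
static int sketch_block_size(unsigned int dev)
{
    int *per_minor;

    assert(SKETCH_MAJOR(dev) < SKETCH_MAX_MAJOR);
    per_minor = sketch_blksize_size[SKETCH_MAJOR(dev)];
    /*
     * Race window: if the driver replaces or frees its array between the
     * load above and the index below, this reader uses a stale pointer.
     */
    if (per_minor == NULL)
        return SKETCH_DEFAULT_BLKSIZE;
    return per_minor[SKETCH_MINOR(dev)];
}
```

Since readers take no lock, a driver that wants a bigger array has no safe moment at which to swap the pointer, which is the resize problem the slide points out.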

Page 21

Large numbers of disks

• Current disk driver allocates 8 majors, allowing for only 128 disks.

• Plans are in the works to allow disk driver to dynamically allocate major numbers.

• Would support up to about 4000 disks, at which point major numbers are exhausted.

• Possible to go beyond this by using fewer bits for partitions.
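The arithmetic behind these figures can be checked with a small sketch. It assumes 8-bit minor numbers and that the disk driver reserves 4 minor bits per disk for partitions (16 minors per disk); the helper names are hypothetical.

```c
#include <assert.h>

/* Assumed width of the minor number. */
enum { SKETCH_MINOR_BITS = 8 };

/* Disks that fit under one major when partition_bits minor bits are used
 * for partitions within each disk. */
static unsigned int sketch_disks_per_major(unsigned int partition_bits)
{
    assert(partition_bits <= SKETCH_MINOR_BITS);
    return 1u << (SKETCH_MINOR_BITS - partition_bits);
}

/* Total disks addressable with the given number of majors. */
static unsigned int sketch_max_disks(unsigned int majors,
                                     unsigned int partition_bits)
{
    return majors * sketch_disks_per_major(partition_bits);
}
```

Under these assumptions, 8 majors with 4 partition bits give 8 × 16 = 128 disks, matching the slide. If every one of roughly 255 majors were available, the ceiling would be 255 × 16 = 4080, in line with "about 4000". Dropping to 3 partition bits doubles the disks per major to 32, at the cost of fewer partitions per disk.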

Page 22

Wish list.

• Implement some SCSI-3 features (larger commands, sense buffers).

• Improve support for shared busses.

• Support target-mode.

• Check module add/remove code for SMP safety, implement locks.

• Improvements related to high-availability.

Page 23

Conclusions

The major goal of a rewrite of SCSI queuing has been accomplished. A number of architectural problems were resolved at the same time.

There are still some interesting tasks to be addressed for 2.5.

See http://www.andante.org/scsi.html for more info, and http://www.andante.org/scsi_todo.html for “todo” list.

Page 24

Contacts

Email: [email protected]

Web: http://www.andante.org

The notes for this talk are on the website.