Copyright © 2005 EMC
Keystone Reserve/Release feature TOI v1.0
Ed Chejlava
8/11/2005 Legato Confidential
Legato Systems, Inc - Confidential and Proprietary 2
Introduction
Prerequisites for attending this TOI session
Overview and Benefits of the new feature
Installation considerations
How to configure/enable the feature
Using the feature
Licensing considerations
Architecture and internal Design
Debugging techniques and tips
Questions and Answers
Prerequisites
Familiarity with:
• NetWorker and shared devices
• All NW OSes
• SCSI
• CDI and MTIO
Overview and Benefits
Problem to be solved:
• Shared devices can be accessed by more than one host/application at any time, resulting in tape position changes or overwriting of data
  • Other backup apps (NetBackup, etc.)
  • System monitoring tools (such as ?)
  • Unwary users with root access: mt, inquire, sji*, cdi*
  • Possible but rare on shared SCSI setups, very common on FC and iSCSI
Reserve/release is an attempt at preventing these problems by using existing SCSI commands to prevent more than one host/application from accessing any particular drive while it is in use by NetWorker.
Overview and Benefits
Normal NetWorker operation over SAN
Save, save, save, save…… done saving
Sees tape as /dev/nst2
Sees tape as /dev/rmt/2*
Sees tape as \\.\Tape23
Overview and Benefits
Quiet time on the SAN
Overview and Benefits
Other host interferes with NetWorker operation
System monitoring app opens /dev/rmt/2cb (note: no ‘n’ for non-rewinding)
Overview and Benefits
The real problem!
System monitoring app closes /dev/rmt/2cb
Tape is rewound!
Overview and Benefits
NetWorker saves some more…
Verify label, save, save, save, save…
Uh oh…. Tape label and older savesets overwritten!!! Bummer dude.
Overview and Benefits (cont.)
Reserve/release enabled:
• Issues a SCSI reserve device command when NW opens tape drive
• Issues a SCSI release device command when NW closes tape drive
(NOT used for file-type devices)
Overview and Benefits (cont.)
Ideally, we should reserve on mount, release on dismount, but that didn’t work and I didn’t have time to make it work.
• Current implementation will not really help the case as shown in these slides, since NW closes tape device when saveset is done.
• Current code will help in the case where “evil” commands are snuck into the tiny gaps between writes.
May be able to get “reserve on mount” method working by RTM….
Overview and Benefits (cont.)
Two basic variants:
• “Simple”
  • Reserve = 0x16, release = 0x17
  • Supported by essentially all devices
• Persistent
  • Persistent Reserve In = 0x5e
  • Persistent Reserve Out = 0x5f
  • Supported by newer tape drives
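To make the opcodes above concrete, here is a minimal Python sketch of the command descriptor blocks (CDBs) for both variants. The byte layouts follow a reading of SPC-2/SPC-3 (6-byte RESERVE/RELEASE, 10-byte PERSISTENT RESERVE IN/OUT); the function names are invented for illustration and are not NetWorker or CDI code.

```python
import struct

RESERVE_6 = 0x16               # "simple" reserve
RELEASE_6 = 0x17               # "simple" release
PERSISTENT_RESERVE_IN = 0x5E
PERSISTENT_RESERVE_OUT = 0x5F


def simple_cdb(opcode):
    """6-byte CDB for RESERVE(6)/RELEASE(6): opcode followed by five zero bytes."""
    if opcode not in (RESERVE_6, RELEASE_6):
        raise ValueError("not a simple reserve/release opcode")
    return bytes([opcode, 0, 0, 0, 0, 0])


def pr_in_cdb(service_action, alloc_len):
    """10-byte PERSISTENT RESERVE IN CDB.

    Byte 1 carries the service action (0 = READ KEYS, 1 = READ RESERVATION),
    bytes 7-8 the allocation length, byte 9 the control byte.
    """
    return struct.pack(">BB5xHB", PERSISTENT_RESERVE_IN,
                       service_action & 0x1F, alloc_len, 0)


def pr_out_cdb(service_action, res_type, param_len=24):
    """10-byte PERSISTENT RESERVE OUT CDB.

    Byte 2 holds scope (high nibble, 0 = logical unit) and type (low nibble);
    bytes 7-8 hold the parameter list length (24 bytes for the basic list).
    """
    return struct.pack(">BBB4xHB", PERSISTENT_RESERVE_OUT,
                       service_action & 0x1F, res_type & 0x0F, param_len, 0)
```

For example, `pr_out_cdb(0x01, 0x3)` would describe a RESERVE service action with the Exclusive Access type; actually sending any of these to a drive requires an OS pass-through interface that this sketch does not cover.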
Overview and Benefits (cont.)
“Simple” reserve/release
• Also used by some OS tape drivers:
  • Solaris
    – Can be disabled for device types using st.conf entries (e.g. for all Quantum DLT7000 drives)
  • AIX
    – Can be disabled for individual devices using SMIT or command line
      » chdev -l 'rmt3' -a res_support='no'
  • Some Windows tape drivers (IBM…)
    – Not known to be configurable
Overview and Benefits (cont.)
“simple” reserve/release (OSes cont.)
• HP-UX
  – Can be controlled for stape driver via kernel parameter st_ats_enabled. Set to 0 (zero) to turn off OS reserve/release
• IRIX
  – Looks like it is configurable (any IRIX experts out there?)
• Linux
  – Maybe yes, maybe no, maybe yes and no
Overview and Benefits (cont.)
OS reserve/release must be disabled for NW’s reserve/release to work.
• If OS R/R not turned off, our R/R will fail and we will turn R/R off for the current tape device.
• Current code is not very verbose about this, but does log a message about the attribute change to disabled.
• Detection of OS reserve should be better by RTM
Overview and Benefits (cont.)
System requirements to use feature
• CDI OS
  • All except IRIX
• Drive supports selected R/R variant
  • All support “simple”
  • Newer support Persistent
  • A few support Persistent with APTPL (Activate Persist Through Power Loss)
Overview and Benefits (cont.)
Limited support!!!
• Default is “None”
  • See Advanced tab in device resource with “View: diagnostic mode” selected.
• Even at RTM, we will only support 2 customers for Reserve/Release – Fidelity (a.k.a. Accenture) and Volvo
• Anyone else having problems with Reserve/release enabled will be told to turn it off.
Overview and Benefits
Where to learn more
• SCSI-3 SPC-2 and SPC-3
  • /usr/src/mfr… or \\fiona\mfr\…
    • ../ANSI/SCSI/SPC-2/ncits_351_2001.pdf
    • ../ANSI/SCSI/SPC-3/spc3r21.pdf
• NetWorker 7.3 eRoom:
  • Dept Folders/Dev/SRS/
    – reserve_release_keystone_4004_srs.1.0.doc
    – Device_reservation_7_2_4004_efs.doc
Installation Considerations
Changes to installation
• None
Configuring the Feature
Configuration
• Two new attributes in device resource:
Under “Advanced” tab in “Properties” sheet with “View/Diagnostic mode” checked:
• Reserve/Release: None, Simple, Persistent, Persistent+APTPL
• Persistent Reserve Key:
Configuring the Feature
Configuration
• Reserve/Release:
  • Default value is “None”
  • Simple uses “old style” SCSI reserve and release commands
  • Persistent uses newer Persistent Reserve Out command with reservation key specified in Persistent Reserve Key attribute
  • Persistent+APTPL is same as Persistent except that the Activate Persist Through Power Loss bit is set
• Persistent Reserve Key:
  • Either 8 character text string or 18 character string representing a 64-bit hex value
    – Default value is “NetWorkr”
    – Example of hex: 0x123456789abcdef0
      » Text value cannot start with 0x (zero – x)
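The key format rules above can be sketched as a small Python validator. This is an illustration only: the function name is hypothetical, and the choices to treat only a lowercase `0x` prefix as hex and to map text keys to their ASCII bytes are assumptions, not something the attribute documentation confirms.

```python
def parse_reserve_key(value="NetWorkr"):
    """Return the 8-byte key for a Persistent Reserve Key attribute value.

    Accepts either an 8-character text string or "0x" followed by
    16 hex digits (18 characters total, a 64-bit value).
    """
    if value.startswith("0x"):
        # Anything starting with 0x is treated as hex, which is why a
        # text key cannot start with those two characters.
        if len(value) != 18:
            raise ValueError("hex key must be 0x plus 16 hex digits")
        return bytes.fromhex(value[2:])  # raises ValueError on non-hex input
    if len(value) != 8:
        raise ValueError("text key must be exactly 8 characters")
    return value.encode("ascii")         # assumption: ASCII byte mapping
```

Both accepted forms yield exactly 8 bytes, matching the 64-bit reservation key that the Persistent Reserve Out command carries.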
Configuring the Feature
Automatic fallback
• If the selected drive does not support the variant of the command that you have selected, the value of the Reserve/Release attribute will be changed as needed and a message will be posted to the messages pane and log files
• If CDI is set to “Not used”, Reserve/Release is simply ignored
Configuring the Feature (cont.)
Example from messages and daemon.log
08/10/05 14:37:44 nsrd: media info: Drive /dev/nst0 reserved (persistent key Berferd1 w/APTPL)
08/10/05 14:37:47 nsrd: media info: Drive /dev/nst0 released (persistent w/APTPL)
08/10/05 14:37:52 nsrd: media info: Drive /dev/nst0 reserved (persistent key Berferd1 w/APTPL)
08/10/05 14:37:52 nsrd: media warning: /dev/nst0 reading: Success
08/10/05 14:37:55 nsrd: media info: Drive /dev/nst0 released (persistent w/APTPL)
08/10/05 14:37:56 nsrd: media warning: /dev/nst0 reading: no tape label found
08/10/05 14:37:56 nsrd: /dev/nst0 Label without mount operation in progress
08/10/05 14:37:56 nsrd: media info: LTO Ultrium-2 tape will be over-written
08/10/05 14:38:01 nsrd: media info: Drive /dev/nst0 released (persistent w/APTPL)
08/10/05 14:38:06 nsrd: media info: Drive /dev/nst0 reserved (persistent key Berferd1 w/APTPL)
08/10/05 14:38:08 nsrd: media info: Drive /dev/nst0 released (persistent w/APTPL)
08/10/05 14:38:08 nsrd: /dev/nst0 Eject operation in progress pools supported: Default;
08/10/05 14:38:22 nsrd: media info: Drive /dev/nst0 released (persistent w/APTPL)
(note the odd order…. I _think_ that is just NW’s normal logging oddness but will look closer to make sure that that isn’t really how the commands are issued.)
Using the Feature
Once the Reserve/Release attribute is set, the user does not need to do anything else. All is handled automatically.
If the user wishes to, they can set unique Persistent Reserve Keys to make it a little easier to see which host is doing what to the various drives, since it is set on a per-NW-device basis.
• Set to “Sol-nw-1” on devices accessed via system Sol-nw-1
• Set to “Win-nw-6” on devices accessed via system Win-nw-6……
Using the Feature
New utility commands (generally not used by the normal user…)
• cdi_reserve and cdi_release
  These act very much like NetWorker does during normal operations
  • Must supply -f <device file name> (as with all cdi utilities)
  • Optional -T plus:
    – s for Simple
    – p for Persistent
    – A for Persistent+APTPL
  • Optional -k <persistent reserve key> if using -T p or -T a
  • Either persistent type will use Exclusive variant
  These are not actually useful during normal operations…
Using the Feature
New utility commands
• cdi_pr: complicated test program for persistent reservation code.
Useful subcommands:
• cdi_pr -f device -r k
  • Read existing registered keys for device
• cdi_pr -f device -r r
  • Read current reservations for device
• cdi_pr -f device -c r -k TestKey
  • Register key TestKey with device
• cdi_pr -f device -c E -k TestKey
  • Reserve device for exclusive access using previously registered key TestKey
Note that these cdi utilities do not have the same key format restrictions as NW attributes: input is either space-padded or truncated as required
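As an illustration of the padding/truncation note, the sketch below normalizes a key to 8 bytes and places it in a 24-byte Persistent Reserve Out parameter list. The layout (reservation key in bytes 0-7, service action reservation key in bytes 8-15, APTPL as bit 0 of byte 20) follows a reading of SPC-3. Padding on the right and the function names are assumptions; this is not the cdi_pr source.

```python
import struct


def cdi_style_key(text):
    """Truncate or space-pad a text key to exactly 8 bytes.

    Assumption: padding is added on the right.
    """
    raw = text.encode("ascii")
    return raw[:8].ljust(8, b" ")


def pr_out_params(key=b"\x00" * 8, sa_key=b"\x00" * 8, aptpl=False):
    """Build the 24-byte PERSISTENT RESERVE OUT parameter list.

    key    -> bytes 0-7  (reservation key)
    sa_key -> bytes 8-15 (service action reservation key)
    aptpl  -> bit 0 of byte 20
    """
    if len(key) != 8 or len(sa_key) != 8:
        raise ValueError("keys must be exactly 8 bytes")
    return struct.pack(">8s8s4xB3x", key, sa_key, 0x01 if aptpl else 0x00)
```

Under this reading, registering TestKey from an unregistered host would put the new key in the service action field, as in `pr_out_params(sa_key=cdi_style_key("TestKey"))`, while a later reserve would pass the same bytes as `key`.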
Using the Feature
New utility commands.
Useful subcommands continued:
• cdi_pr -f device -Q
  • This one tries to figure out what, if any, Persistent Reserve commands the device understands.
  • Attempts to do Persistent Reserve In to read any registered keys. If this fails, the drive does not support Persistent reserve
  • If readkeys works, tries a Register/Ignore with APTPL bit set
  • If Register+APTPL fails, tries Register/Ignore w/o APTPL, which should work
  • On the way out clears any registered keys using the “clear” command
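The probing sequence above can be sketched in Python as follows. The drive object and its `read_keys`, `register_ignore` and `clear` methods are hypothetical stand-ins for the underlying SCSI commands (they are not a real CDI API), and `FakeDrive` exists only so the sketch can be run without hardware.

```python
class PRError(Exception):
    """Raised by the hypothetical drive object when a command fails."""


def probe_persistent_support(drive):
    """Return 'none', 'persistent' or 'persistent+aptpl' for a drive."""
    try:
        drive.read_keys()                  # Persistent Reserve In, read keys
    except PRError:
        return "none"                      # no persistent reserve support
    result = "none"
    # Try with the APTPL bit first, then fall back to plain registration.
    for aptpl, name in ((True, "persistent+aptpl"), (False, "persistent")):
        try:
            drive.register_ignore(aptpl=aptpl)
        except PRError:
            continue
        result = name
        break
    if result != "none":
        drive.clear()                      # remove the key we registered
    return result


class FakeDrive:
    """In-memory stand-in used only to exercise the probe."""

    def __init__(self, supports_pr=True, supports_aptpl=True):
        self.supports_pr = supports_pr
        self.supports_aptpl = supports_aptpl
        self.cleared = False

    def read_keys(self):
        if not self.supports_pr:
            raise PRError("PERSISTENT RESERVE IN rejected")
        return []

    def register_ignore(self, aptpl):
        if aptpl and not self.supports_aptpl:
            raise PRError("APTPL not supported")

    def clear(self):
        self.cleared = True
```

Calling `probe_persistent_support(FakeDrive(supports_aptpl=False))` walks the fallback path and reports plain persistent support.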
Using the Feature (cont.)
Changing reserve/release on one NetWorker device that is using a shared drive should change it for all other devices using the same tape drive (assuming that they were configured correctly with the same “Hardware ID”)
All hosts trying to access a given drive should use the same method (i.e. simple, persistent…)
Using the Feature - limitations
Simple reserve:
• Some OS utilities can break reservations
  • Requires semi-serious OS magic – you must cause a SCSI bus reset or issue a Bus Device Reset SCSI message.
  • Solaris’ mt forcereserve
  • Others??
• Power-cycle drive will clear also
Using the Feature - limitations
Persistent
• No current OS utilities can break persistent reservations, but I’m sure that it will happen.
• Power cycle will not clear persistent reservation if it was made using the APTPL bit… (hence the name – Activate Persist Through Power Loss…)
• Easily broken by user-level code – even though there is a 64-bit “key”:
  • Ask the drive for any keys and reservations that it has (cdi_pr -r k and cdi_pr -r r)
  • Select “old key” that holds the reservation
  • Register your own key (cdi_pr -c r -k <new key>)
  • Reserve with abort (cdi_pr -c a -k <old key> -K <new key> -T E)
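Why the 64-bit key offers so little protection is easier to see in a toy model. The Python sketch below is a deliberately simplified in-memory model of persistent reservation state, not real SCSI semantics in full and not NetWorker code; the class, its method names and the key "Intruder" are invented for illustration.

```python
class ToyDrive:
    """Toy model: a set of registered keys plus at most one reservation holder."""

    def __init__(self):
        self.keys = set()
        self.holder = None

    def register(self, key):
        # Any host may register a key, and registered keys can be read back.
        self.keys.add(key)

    def reserve(self, key):
        if key not in self.keys:
            raise PermissionError("key not registered")
        if self.holder not in (None, key):
            raise PermissionError("reservation conflict")
        self.holder = key

    def preempt(self, my_key, victim_key):
        # A registered host that names the holder's key takes over.
        if my_key not in self.keys:
            raise PermissionError("key not registered")
        self.keys.discard(victim_key)
        if self.holder == victim_key:
            self.holder = my_key


drive = ToyDrive()
drive.register(b"NetWorkr")
drive.reserve(b"NetWorkr")       # NetWorker holds the drive

drive.register(b"Intruder")      # step 3: register your own key
try:
    drive.reserve(b"Intruder")   # a plain reserve is refused...
    refused = False
except PermissionError:
    refused = True
drive.preempt(b"Intruder", b"NetWorkr")   # ...but preempting is not
```

The key is readable by anyone who can talk to the drive, so it identifies the holder rather than acting as a secret.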
Using the Feature – so what good is it?
Even though neither reservation method will stop anyone who is determined to access a tape drive you are using, either method (simple or persistent) will prevent errant applications on other hosts from inadvertently messing with our tape drives.
The best solution is to use the zoning capabilities of current Fibre Channel switches to isolate tape drives from all hosts other than those that are supposed to use them, but we have some customers who won’t do this and who think Reserve/Release will solve all of their problems.
Licensing Considerations
This feature is not licensed
Questions and Answers
Any questions that have not been answered yet?
Me, boring a penguin on South Georgia Island
Debugging Techniques and Tips
Dig through daemon.log to make sure that for each reserve there is a release (unless of course we are currently using the tape drive in question).
Use cdi_pr -f <device> -r k and cdi_pr -f <device> -r r to see who is registered and who has reserved the drive (if using persistent)
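Digging through daemon.log by hand gets tedious, so here is a minimal Python sketch that pairs up reserve and release messages. It assumes the "Drive <device> reserved/released" wording seen in the daemon.log example from the configuration slides; the function name is made up, and the message format may differ in other builds.

```python
import re

# Matches e.g. "... media info: Drive /dev/nst0 reserved (persistent key ...)"
EVENT = re.compile(r"Drive (\S+) (reserved|released)\b")


def check_pairs(lines):
    """Return (drives still reserved, release lines with no prior reserve)."""
    held = set()
    stray = []
    for line in lines:
        match = EVENT.search(line)
        if not match:
            continue                     # not a reserve/release message
        drive, action = match.groups()
        if action == "reserved":
            held.add(drive)
        elif drive in held:
            held.remove(drive)
        else:
            stray.append(line)           # release without a matching reserve
    return held, stray


sample = [
    "08/10/05 14:37:44 nsrd: media info: Drive /dev/nst0 reserved (persistent key Berferd1 w/APTPL)",
    "08/10/05 14:37:47 nsrd: media info: Drive /dev/nst0 released (persistent w/APTPL)",
    "08/10/05 14:37:52 nsrd: media info: Drive /dev/nst0 reserved (persistent key Berferd1 w/APTPL)",
    "08/10/05 14:37:52 nsrd: media warning: /dev/nst0 reading: Success",
]
still_held, stray_releases = check_pairs(sample)
```

A drive left in `still_held` is only a problem if NetWorker is not currently using it; stray releases may simply reflect log ordering rather than the order in which commands were issued.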
Questions and Answers
Any questions that have not been answered yet?
Thanks for attending
Keystone Reserve/Release
feature TOI