Switch to using drive-supplied timeouts for the sa(4) driver.

Description

Summary:
The sa(4) driver has historically used tape drive timeouts that
were one-size-fits-all, with compile-time options to adjust a few
of them.

LTO-9 drives (and presumably other tape drives in the future)
implement a tape characterization process that happens the first
time a tape is loaded. The characterization process formats the
tape to account for the temperature and humidity in the environment
it is being used in. According to the documentation, the process
for LTO-9 tapes can take from 20 minutes to 2 hours (I have
observed 17-18 minutes).

As a result, LTO-9 drives have significantly longer recommended
load times than previous LTO generations.

To handle this, change the sa(4) driver over to using timeouts
supplied by the tape drive using the timeout descriptors obtained
through the REPORT SUPPORTED OPERATION CODES command. That command
was introduced in SPC-4. IBM tape drives going back to at least
LTO-5 report timeout values. Oracle/Sun/StorageTek tape drives
going back to at least the T10000C report timeout values. HP LTO-5
and newer drives report timeout values. The sa(4) driver only
queries drives that claim to support SPC-4.

This makes the timeout settings automatic and accurate for newer
tape drives.

Also, add loader tunable and sysctl support so that the user can
override individual command type timeouts for all tape drives in
the system, or only for specific drives.

The new global (these affect all tape drives) loader tunables are:

kern.cam.sa.timeout.erase
kern.cam.sa.timeout.load
kern.cam.sa.timeout.locate
kern.cam.sa.timeout.mode_select
kern.cam.sa.timeout.mode_sense
kern.cam.sa.timeout.prevent
kern.cam.sa.timeout.read
kern.cam.sa.timeout.read_position
kern.cam.sa.timeout.read_block_limits
kern.cam.sa.timeout.report_density
kern.cam.sa.timeout.reserve
kern.cam.sa.timeout.rewind
kern.cam.sa.timeout.space
kern.cam.sa.timeout.tur
kern.cam.sa.timeout.write
kern.cam.sa.timeout.write_filemarks

The new per-instance loader tunable / sysctl variables are:

kern.cam.sa.%d.timeout.erase
kern.cam.sa.%d.timeout.load
kern.cam.sa.%d.timeout.locate
kern.cam.sa.%d.timeout.mode_select
kern.cam.sa.%d.timeout.mode_sense
kern.cam.sa.%d.timeout.prevent
kern.cam.sa.%d.timeout.read
kern.cam.sa.%d.timeout.read_position
kern.cam.sa.%d.timeout.read_block_limits
kern.cam.sa.%d.timeout.report_density
kern.cam.sa.%d.timeout.reserve
kern.cam.sa.%d.timeout.rewind
kern.cam.sa.%d.timeout.space
kern.cam.sa.%d.timeout.tur
kern.cam.sa.%d.timeout.write
kern.cam.sa.%d.timeout.write_filemarks

The values are reported and set in milliseconds.

share/man/man4/sa.4:
Document the new loader tunables in the sa(4) man page.

sys/cam/scsi/scsi_sa.c:
Add a new timeout_info array to the softc.

Add a default timeouts array, along with descriptions.

Add a new sysctl tree to the softc to handle the timeout
sysctl values.

Add a new function, saloadtotunables(), that will load
the global loader tunables first and then any per-instance
loader tunables second.
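
As an illustration of that load order, here is a minimal userland
sketch. It is not the code in scsi_sa.c: struct kenv_entry,
kenv_fetch_int() and load_timeout_tunable() are hypothetical
stand-ins for the kernel environment and the driver's tunable
fetches, and the numbers used with it are made up. Only the
tunable name patterns come from this change.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for one kernel environment (loader.conf) entry. */
struct kenv_entry {
	const char *name;
	int value_ms;
};

/* Look up an integer tunable by name; leave *out untouched if absent. */
static int
kenv_fetch_int(const struct kenv_entry *env, size_t n, const char *name,
    int *out)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (strcmp(env[i].name, name) == 0) {
			*out = env[i].value_ms;
			return (1);
		}
	}
	return (0);
}

/*
 * Apply the global tunable first and the per-instance tunable second,
 * so that a per-instance setting overrides a global one.  current_ms is
 * whatever value was in effect before the tunables were consulted.
 */
static int
load_timeout_tunable(const struct kenv_entry *env, size_t n, int unit,
    const char *cmd, int current_ms)
{
	char name[96];
	int ms = current_ms;

	assert(cmd != NULL);

	snprintf(name, sizeof(name), "kern.cam.sa.timeout.%s", cmd);
	(void)kenv_fetch_int(env, n, name, &ms);

	snprintf(name, sizeof(name), "kern.cam.sa.%d.timeout.%s", unit, cmd);
	(void)kenv_fetch_int(env, n, name, &ms);

	return (ms);
}
```

With kern.cam.sa.timeout.load and kern.cam.sa.1.timeout.load both
set, sa0 ends up with the global value and sa1 with its own value;
a command type with no tunable keeps the value passed in.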

Add creation of the new timeout sysctl variables in
sasysctlinit().

Add a new, optional probe state to the sa(4) driver. We
previously didn't do any probing, but now we probe for
timeout descriptors if the drive claims to support SPC-4 or
later. In saregister(), we check the SCSI revision and
either launch the probe state machine, or announce the
device and become ready.
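
The decision made at registration time can be sketched roughly as
follows. This is an assumption-laden illustration, not the
saregister() code: the enum and initial_state() are hypothetical,
and the sketch compares the whole VERSION byte (byte 2 of standard
INQUIRY data, where 0x06 means SPC-4), which may differ from how
the driver extracts the revision.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* VERSION field value that indicates SPC-4 in standard INQUIRY data. */
#define INQ_VERSION_SPC4	0x06

/* Hypothetical state names; only the probe state is named in this change. */
enum sketch_state {
	SKETCH_STATE_NORMAL,	/* announce the device and become ready */
	SKETCH_STATE_PROBE	/* ask the drive for timeout descriptors */
};

/*
 * Pick the initial state from raw standard INQUIRY data.  Drives that
 * claim SPC-4 or later are probed; everything else skips the probe.
 */
static enum sketch_state
initial_state(const uint8_t *inq, size_t len)
{
	assert(inq != NULL);

	if (len > 2 && inq[2] >= INQ_VERSION_SPC4)
		return (SKETCH_STATE_PROBE);
	return (SKETCH_STATE_NORMAL);
}
```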

In sastart() and sadone(), add support for the new
SA_STATE_PROBE. If we're probing, we don't go through
saerror(), since that is currently only written to handle
I/O errors in the normal state.

Change every place in the sa(4) driver that fills in
timeout values in a CCB to use the new timeout_info[] array
in the softc.

Add a new saloadtimeouts() routine to parse the returned
timeout descriptors from a completed REPORT SUPPORTED
OPERATION CODES command, and set the values for the
commands we support.
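
For readers unfamiliar with the data being parsed, the sketch
below walks an "all commands" REPORT SUPPORTED OPERATION CODES
response as laid out in SPC-4: a 4-byte length header, then 8-byte
command descriptors, each followed by a 12-byte command timeouts
descriptor when the CTDP bit is set. The helper names are
hypothetical and this is not the saloadtimeouts() implementation.
It assumes the recommended timeout is reported in seconds and
converts it to milliseconds, and it simply takes the first
descriptor matching an opcode, ignoring service actions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RSOC_HDR_LEN		4	/* COMMAND DATA LENGTH header */
#define RSOC_DESC_LEN		8	/* command descriptor */
#define RSOC_TIMEOUT_LEN	12	/* command timeouts descriptor */
#define RSOC_FLAG_CTDP		0x02	/* timeouts descriptor follows */

/* Decode a 32-bit big-endian field. */
static uint32_t
be32(const uint8_t *p)
{
	return (((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	    ((uint32_t)p[2] << 8) | (uint32_t)p[3]);
}

/*
 * Return the recommended timeout for an opcode in milliseconds, or 0
 * if the response carries no timeout descriptor for it.
 */
static uint64_t
rsoc_recommended_timeout_ms(const uint8_t *buf, size_t len, uint8_t opcode)
{
	uint64_t end;
	size_t off;

	assert(buf != NULL);

	if (len < RSOC_HDR_LEN)
		return (0);

	/* Never walk past the data the drive says it returned. */
	end = (uint64_t)RSOC_HDR_LEN + be32(buf);
	if (end > len)
		end = len;

	off = RSOC_HDR_LEN;
	while ((uint64_t)off + RSOC_DESC_LEN <= end) {
		const uint8_t *desc = buf + off;

		off += RSOC_DESC_LEN;
		if ((desc[5] & RSOC_FLAG_CTDP) == 0)
			continue;
		if ((uint64_t)off + RSOC_TIMEOUT_LEN > end)
			break;
		if (desc[0] == opcode) {
			/* Bytes 8-11: recommended command timeout. */
			return ((uint64_t)be32(buf + off + 8) * 1000);
		}
		off += RSOC_TIMEOUT_LEN;
	}
	return (0);
}
```

A return value of 0 means the caller should keep whatever timeout
it already had for that command.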

Add comments explaining the priority order of the various
sources of timeout values. Also, explain that the probe
that pulls in drive recommended timeouts via the REPORT
SUPPORTED OPERATION CODES command is in a race with the
thread that creates the sysctl variables. Because of that
race, it is important that the sysctl thread not load any
timeout values from the kernel environment.

Sponsored by: Spectra Logic

Test Plan:
Try this out with a variety of tape drives and verify that the resulting
timeouts (run sysctl kern.cam.sa to see them) are reasonable.

Reviewers: manpages, cam

Subscribers: imp

Differential Revision: https://reviews.freebsd.org/D33883

(cherry picked from commit 5719b5a1bb643d5622557afe78dca63a800d9b7c)
(cherry picked from commit bcff64c54a74268742f52d40d1eb2acd8ab6f07d)
(cherry picked from commit 6e8a2f04001735353e445570f0d83aa88d4b9b37)

Details

Provenance
ken Authored on Jan 13 2022, 9:07 PM
Differential Revision
D33883: Switch to using drive-supplied timeouts for the sa(4) driver.
Parents
rG5746abf94eb3: libpfctl: fix pfctl_kill_states()