Use cam_periph_invalidate() instead of just setting the PACK_INVALID
flag in the da softc. It's a more appropriate and bigger hammer for this
case. PACK_INVALID is set as part of that, so remove the now-redundant
setting, but use it to complain only the first time we hit the
condition. This also has the side effect of short-circuiting errors for
other I/O still in the drive that is about to fail (sometimes with
different error codes than the one that triggered this ENXIO).

The prior practice of just setting the PACK_INVALID flag, however, was
too ephemeral to be effective. Since daopen would clear PACK_INVALID
after a successful open, we'd have to rediscover the error (which takes
tens of seconds) for every different geom tasting the drive. These two
factors led to a watchdog timeout before we could get through all the
devices if we had multiple failed drives with this syndrome. By
invalidating the periph, we fail fast enough to boot far enough to start
petting the watchdog. If we disable the watchdog, the tasting eventually
completes, but takes over an hour, which is too long. With this change,
it takes an extra minute per failed drive, which is tolerable.
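The behaviour described above can be illustrated with a small userland
model. This is only a sketch, not the driver code: struct model_softc,
MODEL_FLAG_PACK_INVALID, and the model_* functions are hypothetical
stand-ins for the da softc, its PACK_INVALID flag, cam_periph_invalidate()
and daopen. It assumes that an open of an invalidated periph fails with
ENXIO, so the flag is never cleared and the error never has to be
rediscovered.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the PACK_INVALID flag in the da softc. */
#define MODEL_FLAG_PACK_INVALID	0x01

/* Hypothetical, heavily reduced model of the softc plus periph state. */
struct model_softc {
	unsigned flags;		/* holds MODEL_FLAG_PACK_INVALID */
	bool periph_invalid;	/* set once the periph is invalidated */
	int complaints;		/* how many times we logged the failure */
};

/* Models invalidation: the pack-invalid flag is set as part of it. */
static void
model_periph_invalidate(struct model_softc *sc)
{
	sc->periph_invalid = true;
	sc->flags |= MODEL_FLAG_PACK_INVALID;
}

/* Models the error path: complain only the first time, always invalidate. */
static void
model_handle_fatal_error(struct model_softc *sc)
{
	if ((sc->flags & MODEL_FLAG_PACK_INVALID) == 0) {
		sc->complaints++;
		printf("model: drive failed, invalidating periph\n");
	}
	model_periph_invalidate(sc);
}

/*
 * Models open: an invalidated periph fails fast (assumption), while a
 * successful open clears the pack-invalid flag as the old code did.
 */
static int
model_open(struct model_softc *sc)
{
	if (sc->periph_invalid)
		return (ENXIO);
	sc->flags &= ~MODEL_FLAG_PACK_INVALID;
	return (0);
}
```

In this model, calling model_handle_fatal_error() twice logs once, and
every later model_open() returns ENXIO immediately instead of clearing
the flag and forcing the slow rediscovery.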

While cam_periph_error's asc/ascq tables have a SSQ_LOST flag for this
situation, that flag also fails the other periph drivers, like pass,
attached to the device. The docs for these codes are too sparse to help
decide what to do. Err on the side of caution and only invalidate the da
device. Simple commands to collect logs for the vendor still work w/o
hanging the system or other adverse effects. Therefore, I've not added
SSQ_LOST to the asc/ascq table entries for the newer codes.

We can also simplify the logic w/o bloating the change, so do that too.

Sponsored by: Netflix