In tracking down lifecycle issues with da and ada, I noticed we'd get
CCBs of the wrong priority for the state of the probe state machine.
While debugging 6c8ab086fed3, I created this patch, but didn't upstream
it. I had thought that bug was the only cause of the bad CCBs, but I was
mistaken. We still see this message about twice a month in Netflix's
fleet, though the root cause of 6c8ab086fed3 is now gone (despite the
uncertainty expressed in the log: it showed up 1-2 times a week before,
and has shown up 0 times in the two years since).

One cause can be the dynamic I/O scheduler when we're rate limiting
I/O. We'll call the start routine when a timer expires, but that will
interfere with the state machine.

Another cause of this may be related to the I/O coming in too quickly
while we're recovering the device after a different device fails on
mpr/mps.

So to fail safe, since we have to carefully single-step the queue when
we're running the state machine, only accept CCBs that are at priority
CAM_PRIORITY_DEV when we're doing that. In normal mode, only accept
CCBs at priority CAM_PRIORITY_NORMAL. I/O that would normally be
scheduled while the state machine is running is now deferred (it picks
back up again when we enter the normal mode).
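
A minimal userland sketch of the check described above, not the code in
this patch. The model_* names and the enum values are hypothetical
stand-ins for the periph state and for CAM_PRIORITY_DEV and
CAM_PRIORITY_NORMAL; the real start routines have more states and more
to do when they reject a CCB.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Stand-ins for CAM_PRIORITY_DEV and CAM_PRIORITY_NORMAL. The numeric
 * values are placeholders, not the kernel's.
 */
enum model_prio { MODEL_PRIO_DEV = 1, MODEL_PRIO_NORMAL = 2 };

/* Hypothetical periph states: probing, or doing normal I/O. */
enum model_state { MODEL_STATE_PROBE, MODEL_STATE_NORMAL };

struct model_ccb {
	enum model_prio priority;
};

/*
 * Return true if the start routine should accept this CCB, false if it
 * should complain and hand the CCB back without doing any work.
 */
static bool
model_ccb_ok(enum model_state state, const struct model_ccb *ccb)
{
	assert(ccb != NULL);
	if (state == MODEL_STATE_NORMAL)
		return (ccb->priority == MODEL_PRIO_NORMAL);
	/* Any other state is part of the probe state machine. */
	return (ccb->priority == MODEL_PRIO_DEV);
}
```

In this model, a normal-priority CCB offered while probing is rejected,
and so is a dev-priority CCB offered in normal mode.
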
Also add a whiny message on the off chance others are seeing this
problem, to gauge the priority of a fix for the underlying issue.

nda has no real discovery state machine that re-runs after I/O
processing starts, so no workaround is needed there.

Sponsored by: Netflix