Page MenuHomeFreeBSD

cam/nda: Remove impossible CAM codes
ClosedPublic

Authored by imp on Jul 19 2023, 4:15 AM.
Tags
None
Referenced Files
Unknown Object (File)
Mon, Oct 21, 3:47 PM
Unknown Object (File)
Mon, Oct 21, 3:47 PM
Unknown Object (File)
Mon, Oct 21, 3:46 PM
Unknown Object (File)
Mon, Oct 21, 3:24 PM
Unknown Object (File)
Oct 5 2024, 11:54 AM
Unknown Object (File)
Oct 5 2024, 10:53 AM
Unknown Object (File)
Oct 2 2024, 11:25 AM
Unknown Object (File)
Oct 2 2024, 8:31 AM
Subscribers

Details

Summary

The NVME SIM does not generate these status values, so remove them.

Sponsored by: Netflix

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Passed
Unit
No Test Coverage
Build Status
Buildable 52709
Build 49600: arc lint + arc unit

Event Timeline

imp requested review of this revision.Jul 19 2023, 4:15 AM

CAM_ATA_STATUS_ERROR definitely looks weird there. But generally I am not sure why some statuses are counted towards errors and others are not.

nvmf doesn't generate any of these either so far, just CAM_REQUEUE_REQ, CAM_REQ_CMP, CAM_REQ_CMP_ERR, and CAM_REQ_INVALID

This revision is now accepted and ready to land.Jul 19 2023, 5:59 PM
In D41084#935763, @jhb wrote:

nvmf doesn't generate any of these either so far, just CAM_REQUEUE_REQ, CAM_REQ_CMP, CAM_REQ_CMP_ERR, and CAM_REQ_INVALID

It may not generate them now, but would they be completely meaningless there?

In D41084#935763, @jhb wrote:

nvmf doesn't generate any of these either so far, just CAM_REQUEUE_REQ, CAM_REQ_CMP, CAM_REQ_CMP_ERR, and CAM_REQ_INVALID

nvme_sim never gereated these errors, and they were copied from other sim (ata?) when I did the original port.

nvmf could generate the new CAM_NVME_STATUS_ERROR though, right?

In D41084#935770, @imp wrote:
In D41084#935763, @jhb wrote:

nvmf doesn't generate any of these either so far, just CAM_REQUEUE_REQ, CAM_REQ_CMP, CAM_REQ_CMP_ERR, and CAM_REQ_INVALID

nvme_sim never gereated these errors, and they were copied from other sim (ata?) when I did the original port.

nvmf could generate the new CAM_NVME_STATUS_ERROR though, right?

Yes, it will switch to using that once I rebase on top of that.

In D41084#935768, @mav wrote:
In D41084#935763, @jhb wrote:

nvmf doesn't generate any of these either so far, just CAM_REQUEUE_REQ, CAM_REQ_CMP, CAM_REQ_CMP_ERR, and CAM_REQ_INVALID

It may not generate them now, but would they be completely meaningless there?

The only one that might conceivably matter is CAM_REQ_ABORTED. Right now I punt on XPT_ABORT in nvmf and instead just have requests hang forever until either the controller is destroyed entirely, or a connection is reestablished and hung commands resume. iSCSI has a knob to permit failing commands on disconnect instead (useful for multipath situations) and I'm not sure if in that case it might make sense to use CAM_REQ_ABORTED if nvmf supports that kind of knob in the future. That is the only one I can think of though.

It is true that with NVMe/TCP you can in theory get partial data transfers due to a checksum failure in a PDU in the middle of a transfer, but those would be reported as partial success (which would need a new resid member in nvmeio). Right now they are just reported as an error via CAM_REQ_CMP_ERR so that the entire request is retried from the start.

In D41084#935778, @jhb wrote:

It is true that with NVMe/TCP you can in theory get partial data transfers due to a checksum failure in a PDU in the middle of a transfer, but those would be reported as partial success (which would need a new resid member in nvmeio). Right now they are just reported as an error via CAM_REQ_CMP_ERR so that the entire request is retried from the start.

In SCSI resid reports correctly short response from target, like a log or mode page read response where data buffer is shorter than requested (initiator may not know the right size, or it may change). Checksum failure should probably still be a failure.

In D41084#935827, @mav wrote:
In D41084#935778, @jhb wrote:

It is true that with NVMe/TCP you can in theory get partial data transfers due to a checksum failure in a PDU in the middle of a transfer, but those would be reported as partial success (which would need a new resid member in nvmeio). Right now they are just reported as an error via CAM_REQ_CMP_ERR so that the entire request is retried from the start.

In SCSI resid reports correctly short response from target, like a log or mode page read response where data buffer is shorter than requested (initiator may not know the right size, or it may change). Checksum failure should probably still be a failure.

In this case the issue is you might have a request that spans multiple data PDUs and get a checksum error in a later PDU. In that case you get a partial completion where the data from the earlier PDUs was received and copied into the request's data buffer, but remaining data for the transfer was discarded (and it's discarded at a PDU boundary). This does mean it's really a partial completion in terms of the effects on the data buffer used in the CCB or bio.

This revision was automatically updated to reflect the committed changes.