Page MenuHomeFreeBSD

nvme: report failure from bus_bind_intr
Needs ReviewPublic

Authored by emaste on Aug 15 2023, 1:56 AM.
Tags
None
Referenced Files
Unknown Object (File)
Sun, Oct 27, 8:41 PM
Unknown Object (File)
Sep 30 2024, 1:15 PM
Unknown Object (File)
Sep 16 2024, 3:21 PM
Unknown Object (File)
Sep 11 2024, 10:22 AM
Unknown Object (File)
Sep 10 2024, 8:03 PM
Unknown Object (File)
Sep 8 2024, 1:32 AM
Unknown Object (File)
Sep 7 2024, 11:48 AM
Unknown Object (File)
Aug 17 2024, 7:09 AM
Subscribers

Details

Reviewers
imp
Summary

Also give up binding the remaining qpair interrupts.

Diff Detail

Lint
Lint Skipped
Unit
Tests Skipped

Event Timeline

I'm planning to add diagnostics lower down, in D41455, so this may not be necessary, but it might provide more insight.

sys/dev/nvme/nvme_ctrlr.c
219

Why give up on the rest?

sys/dev/nvme/nvme_ctrlr.c
219

My thinking is that if this fails it's likely going to be because of an issue like trying to bind to APIC IDs >= 256, and rather than end up with some re-assigned and some not we can just leave the round-robin attempt provided (and also save a bunch of console spam).

Alternately maybe just:

diff --git a/sys/dev/nvme/nvme_ctrlr.c b/sys/dev/nvme/nvme_ctrlr.c
index 96a3c78fe70a..00b92c7239ee 100644
--- a/sys/dev/nvme/nvme_ctrlr.c
+++ b/sys/dev/nvme/nvme_ctrlr.c
@@ -207,8 +207,15 @@ nvme_ctrlr_construct_io_qpairs(struct nvme_controller *ctrlr)
                 * Do not bother binding interrupts if we only have one I/O
                 *  interrupt thread for this controller.
                 */
-               if (ctrlr->num_io_queues > 1)
-                       bus_bind_intr(ctrlr->dev, qpair->res, qpair->cpu);
+               if (ctrlr->num_io_queues > 1) {
+                       error = bus_bind_intr(ctrlr->dev, qpair->res,
+                           qpair->cpu);
+                       if (error) {
+                               nvme_printf(ctrlr,
+                                   "bus_bind_intr to CPU %u failed (%d)\n",
+                                   qpair->cpu, error);
+                       }
+               }
        }
 
        return (0);

Alternately maybe just:

Actually, if that's all we'd want to do I can probably just abandon this (and log message(s) from D41455 could be sufficient).

@imp let me know if you think this patch is useful or if we should just drop it.

I discussed a somewhat similar case with @olce.freebsd_certner.fr (where an upper level might be able to provide a more informative/direct error, but there is also already an error printed at a lower level, that doesn't have the same context).

I'm agnostic on error messages, so long as the user can take action.

Still unsure on the stopping the allocation. Wouldn't that produce some queues without an MSI to generate an interrupt?

Wouldn't that produce some queues without an MSI to generate an interrupt?

No, they've all already been assigned *some* interrupt at this point.