Page MenuHomeFreeBSD

nvme: Eliminate RECOVERY_FAILED state
ClosedPublic

Authored by imp on Oct 2 2023, 9:20 PM.
Tags
None
Referenced Files
Unknown Object (File)
Mon, Jan 6, 6:09 PM
Unknown Object (File)
Fri, Dec 20, 2:31 PM
Unknown Object (File)
Fri, Dec 20, 6:56 AM
Unknown Object (File)
Dec 9 2024, 5:42 AM
Unknown Object (File)
Nov 24 2024, 2:54 AM
Unknown Object (File)
Nov 23 2024, 2:16 PM
Unknown Object (File)
Nov 21 2024, 6:43 PM
Unknown Object (File)
Nov 18 2024, 7:32 PM
Subscribers

Details

Summary

While it seemed like a good idea to have this state, we can do
everything we wanted with the state by checking ctrlr->is_failed since
that's set before we start failing the qpairs. Add some comments about
racing when we're failing the controller, though in practice I'm not
sure that kind of race could even be lost.

Sponsored by: Netflix

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

imp requested review of this revision.Oct 2 2023, 9:20 PM
imp added a parent revision: D42050: nvme: Remove stale comment.
sys/dev/nvme/nvme_qpair.c
1038

Would it be good hygiene to add
qpair->timer_armed = false;
here or will that already be true?

sys/dev/nvme/nvme_qpair.c
1038

It won't already be true.
In fact, we rely on the timeout to disarm itself now.
I'll add it here for this review...

But now that you point this out, and I got to thinking... This could lead to a crash if we're unloading the driver under heavy load maybe (since we won't be idle).
I'll followup with a separate patch to fix that since we're currently never cancelling the timer (I was sure we were, but a quick grep suggests it was overlooked). I'll have to study to find which of the destroy, destruct, etc methods would be right.

a little bit of hygine: set timer_active to false for failed case.
Not sure that it matters, but it's good practice.

This revision is now accepted and ready to land.Oct 4 2023, 3:35 PM
This revision was automatically updated to reflect the committed changes.