LinuxKPI: 802.11: change teardown order to avoid iwlwifi firmware crashes
ClosedPublic

Authored by bz on May 22 2024, 2:31 AM.

Details

Summary

While the previous order worked well for iwlwifi 22000 and later chipsets
(AXxxx, BE200), earlier chipsets had trouble and ran into firmware crashes.
Try changing the teardown order to avoid these problems. The inline
comments in lkpi_sta_run_to_init() (and lkpi_disassoc()) try to document
the new order and also the old problems we were seeing (too early sta
removal or silent non-removal) leading to follow-up problems.

Sponsored by: The FreeBSD Foundation
MFC after: 3 days
PR: 275255

Test Plan

Tested on AX200 and 8265.

This review is open mostly so that people from the PR can test the change.

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Passed
Unit
No Test Coverage
Build Status
Buildable 57830
Build 54718: arc lint + arc unit

Event Timeline

bz requested review of this revision. May 22 2024, 2:31 AM

There may be another code path which can still trigger this (or another) problem leading to a FW crash with the old 8xxx/9xxx cards (leading to follow-up KASSERT triggers on GENERIC).

I'll have a look tomorrow; in case you see anything but

iwlwifi0: lkpi_sta_scan_to_auth:1310: mo_sta_state(NONE) failed: -5
iwlwifi0: lkpi_iv_newstate: error -1 during state transition 5 (RUN) -> 2 (AUTH)

after a firmware crash, please follow up here or on the PR and let me know.
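For testers who want to check their logs against the two messages above, here is a minimal sketch of a filter. It is not part of the change. The function name `unexpected_lines`, the `ifname` parameter, and the choice to look only at messages starting with `lkpi_` are assumptions made for illustration. The number after the function name in the first message is matched loosely, on the assumption that it can differ between builds.

```python
import re

# The two messages quoted above as expected after a firmware crash.
# The number after the function name is matched as \d+ (assumption:
# it may vary from build to build).
EXPECTED = [
    re.compile(r"^lkpi_sta_scan_to_auth:\d+: mo_sta_state\(NONE\) failed: -5$"),
    re.compile(r"^lkpi_iv_newstate: error -1 during state transition "
               r"5 \(RUN\) -> 2 \(AUTH\)$"),
]


def unexpected_lines(log_text, ifname="iwlwifi0"):
    """Return lkpi_* messages for ifname that match neither expected line."""
    prefix = ifname + ": "
    result = []
    for line in log_text.splitlines():
        line = line.strip()
        if not line.startswith(prefix):
            continue
        body = line[len(prefix):]
        # Assumption: only LinuxKPI 802.11 messages (lkpi_ prefix) matter here.
        if not body.startswith("lkpi_"):
            continue
        if not any(p.match(body) for p in EXPECTED):
            result.append(line)
    return result
```

Passing the saved console output (for example the text of dmesg captured after a crash) to `unexpected_lines()` returns the lines worth reporting here or on the PR. An empty list means only the two known messages were seen.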

Doh! https://reviews.freebsd.org/D43967 never made it in; its description needs updating etc., but the change should go into main as well. It's likely I hit that race. Grrr, decade-old net80211 problems everyone ignored.

Taerdown -> teardown in commit message.

bz retitled this revision from "LinuxKPI: 802.11: change taerdown order to avoid iwlwifi firmware crashes" to "LinuxKPI: 802.11: change teardown order to avoid iwlwifi firmware crashes". May 22 2024, 4:05 AM

In D45293#1033443, @imp wrote:

Taerdown -> teardown in commit message.

Thanks! With tears in my eyes at 6AM ... and another good report that this helps on the PR already :)

The still-observed problem seems to occur when we go from run to auth (without a full restart) and the channel context changes (e.g. chan 6 -> chan 1) on the way through (run -> scan -> auth). I still wonder whether the station (ni) changed as well in that case, but the more debugging I add, the harder this gets to trigger. At least I can switch from sta to chanctx debugging now.
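To see whether the remaining failures really cluster on the run -> auth path, the `lkpi_iv_newstate` error lines can be tallied per transition. The following is a rough sketch under the assumption that all such lines follow the format quoted earlier in this thread. The helper name `failed_transitions` is hypothetical, and the state names are taken from the log text itself rather than from any kernel header.

```python
import re
from collections import Counter

# Matches lines such as:
#   lkpi_iv_newstate: error -1 during state transition 5 (RUN) -> 2 (AUTH)
TRANSITION = re.compile(
    r"lkpi_iv_newstate: error (-?\d+) during state transition "
    r"(\d+) \((\w+)\) -> (\d+) \((\w+)\)"
)


def failed_transitions(log_text):
    """Count failed state transitions, keyed by (from_state, to_state)."""
    counts = Counter()
    for match in TRANSITION.finditer(log_text):
        from_name = match.group(3)
        to_name = match.group(5)
        counts[(from_name, to_name)] += 1
    return counts
```

If the counter shows keys other than `("RUN", "AUTH")`, that would hint at a different code path than the one described above and is worth mentioning in a follow-up.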

This revision was not accepted when it landed; it landed in state Needs Review. May 22 2024, 9:08 PM
This revision was automatically updated to reflect the committed changes.