Page MenuHomeFreeBSD

witness: exit witness_checkorder() if rebooting
AbandonedPublic

Authored by mhorne on Feb 17 2023, 7:55 PM.
Tags
None
Referenced Files
F106703895: D38656.diff
Sat, Jan 4, 2:52 AM
Unknown Object (File)
Mon, Dec 30, 5:42 AM
Unknown Object (File)
Nov 27 2024, 9:15 AM
Unknown Object (File)
Nov 27 2024, 9:15 AM
Unknown Object (File)
Nov 27 2024, 8:52 AM
Unknown Object (File)
Nov 17 2024, 3:14 PM
Unknown Object (File)
Nov 14 2024, 8:28 AM
Unknown Object (File)
Nov 14 2024, 5:51 AM
Subscribers
None

Details

Reviewers
jhb
markj
bz
imp
Summary

witness_checkorder() contains an assertion against acquiring a
blockable/sleepable lock within a critical section. This assertion is
not triggered if kdb_active is true. The function also returns early
when KERNEL_PANICKED() is true.

The kern_reboot() function will set kdb_active = 0 as part of the
shutdown sequence. After that it will acquire the eventhandler_mutex to
run the shutdown handlers.

When invoking the kernel debugger via console alt-break sequence, we
call kdb_enter() from an interrupt filter context (critical section). If
the "reset" command is issued in ddb, this leaves a very small window
where:

  1. The kernel has not panicked.
  2. kdb_active is false.
  3. We appear to be running in a critical section.

This triggers the assertion in witness_checkorder() when we try to
acquire eventhandler_mutex. To fix this we can also check against
'rebooting' variable, which is set to true in the shutdown path right
before kdb_active is adjusted.

This was made visible by my recent change to the "reset" command in
5644850620ae.

Reported by: bz

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Passed
Unit
No Test Coverage
Build Status
Buildable 49923
Build 46815: arc lint + arc unit

Event Timeline

mhorne created this revision.
sys/kern/subr_witness.c
1128

There are multiple ways to solve the reported issue, this one being the minimal change required. A more complete solution might be to upgrade the panic checks in this file to SCHEDULER_STOPPED(), as I have done with lockmgr in D38655. I am less certain of the implications of that, however.

The kdb_active check below suggests that we expect to execute some portion of the witness code while kdb is active. I cannot say why that would be useful exactly.

sys/kern/subr_witness.c
1128

If I understand correctly, this could hide witness reports raised during regular shutdown/reboot, which would make it harder to debug shutdown deadlocks. Is that right?

rebooting has too wide a scope. Instead, switch the !kdb_active check to !SCHEDULER_STOPPED(). For posterity, include sys/kassert.h explicity as we already rely on it.

mhorne added inline comments.
sys/kern/subr_witness.c
1128

Absolutely right, thanks.

This revision is now accepted and ready to land.Feb 21 2023, 7:02 PM
mhorne marked an inline comment as done.

Let's do D42684 instead.