Page MenuHomeFreeBSD

sound: Fix hot-unload page fault
AbandonedPublic

Authored by christos on Sep 29 2024, 2:47 PM.
Tags
None
Referenced Files
Unknown Object (File)
Mon, Nov 4, 2:24 AM
Unknown Object (File)
Thu, Oct 24, 5:55 AM
Unknown Object (File)
Oct 18 2024, 11:46 PM
Unknown Object (File)
Oct 5 2024, 8:34 AM
Unknown Object (File)
Oct 4 2024, 11:58 AM
Unknown Object (File)
Oct 4 2024, 12:52 AM
Unknown Object (File)
Sep 30 2024, 4:42 PM
Subscribers

Details

Summary

This panic is easily reproduced by spawning many channels, and unloading
the module right after. The panic messages usually state that we take a
page fault in destroy_devl() with the cdev lock held. My theory is that
we go to sleep somewhere in destroy_devl(), and by the time we wake up,
pcm_killchans() will have freed resources that destroy_devl() will try
to access, resulting in a page fault.

Sponsored by: The FreeBSD Foundation
MFC after: 2 days

Test Plan

Reproducer:

# kldload snd_dummy # or an actual sound card
# sysctl hw.snd.maxautovchans=50 # even the default can trigger it
# for i in $(seq 1 50); do cat /dev/random >/dev/dsp & done
# kldunload snd_dummy
Kernel page fault with the following non-sleepable locks held:
exclusive sleep mutex cdev (cdev) r = 0 (0xffffffff8301dd00) locked @ /mnt/src/sys/kern/kern_conf.c:72
stack backtrace:
#0 0xffffffff811c97bf at witness_debugger+0x13f
#1 0xffffffff811cb344 at witness_warn+0x674
#2 0xffffffff81a212c6 at trap_pfault+0x116
#3 0xffffffff81a2023c at trap+0x54c
#4 0xffffffff819dc7d8 at calltrap+0x8
#5 0xffffffff8101d97c at destroy_dev_tq+0x1bc
#6 0xffffffff811b256f at taskqueue_run_locked+0x1ff
#7 0xffffffff811b4258 at taskqueue_thread_loop+0x138
#8 0xffffffff81060e93 at fork_exit+0xa3
#9 0xffffffff819dd83e at fork_trampoline+0xe


Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 01
fault virtual address   = 0xffffffff841e99b8
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff8101a40b
stack pointer           = 0x28:0xfffffe0046a16c20
frame pointer           = 0x28:0xfffffe0046a16d00
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 0 (thread taskq)
rdi: ffffffff841e99b8 rsi: 0000000000000008 rdx: ffffffff81fbbbe0
rcx: 0000000000000000  r8: 0000000000000000  r9: 0000000000000001
rax: 0000003ff083d337 rbx: fffffe004a539c00 rbp: fffffe0046a16d00
r10: 0000000000000000 r11: 0000000000000002 r12: ffffffff841e99b8
r13: fffffe004a539e50 r14: fffffe004a539c60 r15: fffffe004a539c68
trap number             = 12
panic: page fault
cpuid = 1
time = 1727629918
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0xa5/frame 0xfffffe0046a16390
kdb_backtrace() at kdb_backtrace+0xc6/frame 0xfffffe0046a164f0
vpanic() at vpanic+0x226/frame 0xfffffe0046a16690
panic() at panic+0xb5/frame 0xfffffe0046a16750
trap_fatal() at trap_fatal+0x65b/frame 0xfffffe0046a16850
trap_pfault() at trap_pfault+0x12b/frame 0xfffffe0046a16970
trap() at trap+0x54c/frame 0xfffffe0046a16b50
calltrap() at calltrap+0x8/frame 0xfffffe0046a16b50
--- trap 0xc, rip = 0xffffffff8101a40b, rsp = 0xfffffe0046a16c20, rbp = 0xfffffe0046a16d00 ---
destroy_devl() at destroy_devl+0x6eb/frame 0xfffffe0046a16d00
destroy_dev_tq() at destroy_dev_tq+0x1bc/frame 0xfffffe0046a16d50
taskqueue_run_locked() at taskqueue_run_locked+0x1ff/frame 0xfffffe0046a16eb0
taskqueue_thread_loop() at taskqueue_thread_loop+0x138/frame 0xfffffe0046a16ef0
fork_exit() at fork_exit+0xa3/frame 0xfffffe0046a16f30
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe0046a16f30
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
KDB: enter: panic
[ thread pid 0 tid 100014 ]
Stopped at      kdb_enter+0x34: movq    $0,0x1f09b11(%rip)
db>

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Skipped
Unit
Tests Skipped
Build Status
Buildable 59655
Build 56542: arc lint + arc unit

Event Timeline

In general it makes sense to remove entry points (a device file in this case) first when tearing down a driver, so this is probably fine.

In general it makes sense to remove entry points (a device file in this case) first when tearing down a driver, so this is probably fine.

So is this good to go?

In general it makes sense to remove entry points (a device file in this case) first when tearing down a driver, so this is probably fine.

So is this good to go?

I don't see any problem with the diff. Does it fix the panics?

For the records, I consistently get a different panic when reproducing this (without the patch):

Fatal trap 9: general protection fault while in kernel mode
cpuid = 3; apic id = 03
instruction pointer     = 0x20:0xffffffff81073328
stack pointer           = 0x28:0xfffffe00d92d7610
frame pointer           = 0x28:0xfffffe00d92d7610
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = resume, IOPL = 0
current process         = 1368 (sh)
rdi: deadc0dedeadc0d8 rsi: 0000000000000000 rdx: fffffe00d92d78b0
rcx: deadc0dedeadc0de  r8: fefefefefefefeff  r9: 8080808080808080
rax: deadc0dedeadc0de rbx: 0000000000000000 rbp: fffffe00d92d7610
r10: deadc0dedeadc0de r11: 0000000000000000 r12: 0000000000000001
r13: 0000000000000000 r14: 00000000d92d7820 r15: 0000000000000000
trap number             = 9
panic: general protection fault
cpuid = 3
time = 1727998724
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00d92d7350
vpanic() at vpanic+0x13f/frame 0xfffffe00d92d7480
panic() at panic+0x43/frame 0xfffffe00d92d74e0
trap_fatal() at trap_fatal+0x40b/frame 0xfffffe00d92d7540
calltrap() at calltrap+0x8/frame 0xfffffe00d92d7540
--- trap 0x9, rip = 0xffffffff81073328, rsp = 0xfffffe00d92d7610, rbp = 0xfffffe00d92d7610 ---
strlen() at strlen+0x28/frame 0xfffffe00d92d7610
kvprintf() at kvprintf+0xfa4/frame 0xfffffe00d92d7720
vsnprintf() at vsnprintf+0x2e/frame 0xfffffe00d92d7740
vpanic() at vpanic+0xce/frame 0xfffffe00d92d7870
panic() at panic+0x43/frame 0xfffffe00d92d78d0
__mtx_lock_flags() at __mtx_lock_flags+0x12c/frame 0xfffffe00d92d7920
_cv_wait() at _cv_wait+0x1a8/frame 0xfffffe00d92d7990
dsp_open() at dsp_open+0x19f/frame 0xfffffe00d92d7a00
devfs_open() at devfs_open+0x12f/frame 0xfffffe00d92d7a70
VOP_OPEN_APV() at VOP_OPEN_APV+0x2f/frame 0xfffffe00d92d7a90
vn_open_vnode() at vn_open_vnode+0x1b9/frame 0xfffffe00d92d7b00
vn_open_cred() at vn_open_cred+0x598/frame 0xfffffe00d92d7c80
openatfp() at openatfp+0x287/frame 0xfffffe00d92d7dd0
sys_openat() at sys_openat+0x3d/frame 0xfffffe00d92d7e00
amd64_syscall() at amd64_syscall+0x158/frame 0xfffffe00d92d7f30
fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe00d92d7f30
--- syscall (499, FreeBSD ELF64, openat), rip = 0x86fbf3086fa, rsp = 0x86fbad739c8, rbp = 0x86fbad73ac0 ---
KDB: enter: panic
[ thread pid 1368 tid 100396 ]
Stopped at      kdb_enter+0x33: movq    $0,0x105a1b2(%rip)

Unfortunately, I get a slightly different panic with this patch installed:

Kernel page fault with the following non-sleepable locks held:
exclusive sleep mutex dsp1.p1 (pcm play channel) r = 0 (0xfffff800015bb960) locked @ /usr/src/sys/dev/sound/pcm/vchan.c:161
stack backtrace:
#0 0xffffffff80bc40bc at witness_debugger+0x6c
#1 0xffffffff80bc52b3 at witness_warn+0x403
#2 0xffffffff810770c0 at trap_pfault+0x80
#3 0xffffffff810491a8 at calltrap+0x8
#4 0xffffffff808c60ea at chn_trigger+0xea
#5 0xffffffff808c6a57 at chn_abort+0x57
#6 0xffffffff808e2dce at pcm_unregister+0x13e
#7 0xffffffff8351025b at dummy_detach+0x2b
#8 0xffffffff80b8a894 at device_detach+0x194
#9 0xffffffff80b89e76 at devclass_driver_deleted+0x66
#10 0xffffffff80b89d61 at devclass_delete_driver+0x81
#11 0xffffffff80b909b7 at driver_module_handler+0x87
#12 0xffffffff80b28042 at module_unload+0x32
#13 0xffffffff80b1710b at linker_file_unload+0x1eb
#14 0xffffffff80b18a90 at kern_kldunload+0x190
#15 0xffffffff810779e8 at amd64_syscall+0x158
#16 0xffffffff81049abb at fast_syscall_common+0xf8


Fatal trap 12: page fault while in kernel mode
cpuid = 2; apic id = 02
fault virtual address   = 0x120
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff808e6673
stack pointer           = 0x28:0xfffffe00d9377b60
frame pointer           = 0x28:0xfffffe00d9377b90
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 1283 (kldunload)
rdi: fffff800015bb960 rsi: 0000000000000008 rdx: 0000000000000000
rcx: 0000000000000000  r8: 0000000000000003  r9: ffffffffffffffff
rax: 0000000000000000 rbx: fffff8016442ec00 rbp: fffffe00d9377b90
r10: 0000000000000000 r11: 0000000000000004 r12: fffff8016442bc00
r13: 0000000000000001 r14: fffff8016442bc00 r15: 00000000ffffffff
trap number             = 12
panic: page fault
cpuid = 2
time = 1727999742
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00d9377830
vpanic() at vpanic+0x13f/frame 0xfffffe00d9377960
panic() at panic+0x43/frame 0xfffffe00d93779c0
trap_fatal() at trap_fatal+0x40b/frame 0xfffffe00d9377a20
trap_pfault() at trap_pfault+0xa0/frame 0xfffffe00d9377a90
calltrap() at calltrap+0x8/frame 0xfffffe00d9377a90
--- trap 0xc, rip = 0xffffffff808e6673, rsp = 0xfffffe00d9377b60, rbp = 0xfffffe00d9377b90 ---
vchan_trigger() at vchan_trigger+0xc3/frame 0xfffffe00d9377b90
chn_trigger() at chn_trigger+0xea/frame 0xfffffe00d9377bd0
chn_abort() at chn_abort+0x57/frame 0xfffffe00d9377c00
pcm_unregister() at pcm_unregister+0x13e/frame 0xfffffe00d9377c30
dummy_detach() at dummy_detach+0x2b/frame 0xfffffe00d9377c50
device_detach() at device_detach+0x194/frame 0xfffffe00d9377c90
devclass_driver_deleted() at devclass_driver_deleted+0x66/frame 0xfffffe00d9377cd0
devclass_delete_driver() at devclass_delete_driver+0x81/frame 0xfffffe00d9377d10
driver_module_handler() at driver_module_handler+0x87/frame 0xfffffe00d9377d60
module_unload() at module_unload+0x32/frame 0xfffffe00d9377d80
linker_file_unload() at linker_file_unload+0x1eb/frame 0xfffffe00d9377dc0
kern_kldunload() at kern_kldunload+0x190/frame 0xfffffe00d9377e00
amd64_syscall() at amd64_syscall+0x158/frame 0xfffffe00d9377f30
fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe00d9377f30
--- syscall (444, FreeBSD ELF64, kldunloadf), rip = 0xc826034849a, rsp = 0xc825f3fa538, rbp = 0xc825f3fad90 ---
KDB: enter: panic
[ thread pid 1283 tid 100417 ]
Stopped at      kdb_enter+0x33: movq    $0,0x105a1b2(%rip)

@christos Any prerequisites for this? I have tested this on top of your other changes that are up for review. Except for D46700, and possibly other changes where I'm not involved as reviewer.

Interesting. What do you do exactly to reproduce those? Also what is the configuration you're using?

Interesting. What do you do exactly to reproduce those? Also what is the configuration you're using?

I just put your reproduce code into a shell script. Setup is a bhyve VM, 4 cores, pcm0 is a passthrough USB audio interface, script operates on pcm1 which is snd_dummy as in your example.

Interesting. What do you do exactly to reproduce those? Also what is the configuration you're using?

I just put your reproduce code into a shell script. Setup is a bhyve VM, 4 cores, pcm0 is a passthrough USB audio interface, script operates on pcm1 which is snd_dummy as in your example.

I tried this exact scenario, but still cannot reproduce the panic after multiple times of running the script. Is there some non-default configuration you're using?

I just put your reproduce code into a shell script. Setup is a bhyve VM, 4 cores, pcm0 is a passthrough USB audio interface, script operates on pcm1 which is snd_dummy as in your example.

I tried this exact scenario, but still cannot reproduce the panic after multiple times of running the script. Is there some non-default configuration you're using?

Nothing special, I don't have any non-default config in this test VM. It took me about 7 attempts to reproduce this with 50 channels (as in your example), I think it took less attempts with 100 as I had in my script originally. I also updated src in the meantime, same panic.

I just put your reproduce code into a shell script. Setup is a bhyve VM, 4 cores, pcm0 is a passthrough USB audio interface, script operates on pcm1 which is snd_dummy as in your example.

I tried this exact scenario, but still cannot reproduce the panic after multiple times of running the script. Is there some non-default configuration you're using?

Nothing special, I don't have any non-default config in this test VM. It took me about 7 attempts to reproduce this with 50 channels (as in your example), I think it took less attempts with 100 as I had in my script originally. I also updated src in the meantime, same panic.

Right, so after a couple more tries I got the following panic, which is what I was getting before the patch as well... So seems like the patch does make the code robust, but not completely:

pcm0: <Dummy Audio Device>
hw.snd.maxautovchans: 16 -> 256
pcm0: detached


Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 01
fault virtual address   = 0xffffffff841bbeb0
fault code              = supervisor read instruction, page not present
instruction pointer     = 0x20:0xffffffff841bbeb0
stack pointer           = 0x28:0xfffffe0046a16bd8
frame pointer           = 0x28:0xfffffe0046a16c10
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 0 (thread taskq)
rdi: fffffe0046909260 rsi: fffffe0046a16b60 rdx: 0000000000000000
rcx: 0000000000000000  r8: 0000000000000000  r9: 0000000000000001
rax: 000000000b9b56a9 rbx: fffffe005cdab540 rbp: fffffe0046a16c10
r10: 0000000000000000 r11: 0000000000000006 r12: fffffe004a5ec020
r13: fffffe004a507a30 r14: ffffffff841bbeb0 r15: fffffe004a5ec000
trap number             = 12
panic: page fault
cpuid = 1
time = 1728856313
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0xa5/frame 0xfffffe0046a16330
kdb_backtrace() at kdb_backtrace+0xc6/frame 0xfffffe0046a16490
vpanic() at vpanic+0x226/frame 0xfffffe0046a16630
panic() at panic+0xb5/frame 0xfffffe0046a166f0
trap_fatal() at trap_fatal+0x65b/frame 0xfffffe0046a167f0
trap_pfault() at trap_pfault+0x12b/frame 0xfffffe0046a16910
trap() at trap+0x54c/frame 0xfffffe0046a16b00
calltrap() at calltrap+0x8/frame 0xfffffe0046a16b00
--- trap 0xc, rip = 0xffffffff841bbeb0, rsp = 0xfffffe0046a16bd8, rbp = 0xfffffe0046a16c10 ---
ngsocklist() at 0xffffffff841bbeb0/frame 0xfffffe0046a16c10
destroy_devl() at destroy_devl+0x5d5/frame 0xfffffe0046a16d00
destroy_dev_tq() at destroy_dev_tq+0x1bc/frame 0xfffffe0046a16d50
taskqueue_run_locked() at taskqueue_run_locked+0x1ff/frame 0xfffffe0046a16eb0
taskqueue_thread_loop() at taskqueue_thread_loop+0x138/frame 0xfffffe0046a16ef0
fork_exit() at fork_exit+0xa3/frame 0xfffffe0046a16f30
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe0046a16f30
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
KDB: enter: panic
[ thread pid 0 tid 100014 ]
Stopped at      kdb_enter+0x34: movq    $0,0x1f09b11(%rip)
db>