The filesystem has significant bugs, as usual related to the vnode lifecycle and its interaction with the mqueue lifecycle. Fixing it would take a lot of effort, which seems futile. E.g., the trivial, immediately panicking issue fixed in f0a4dd6d46e99d47fde1 prevented mounting mqueuefs at all, and was there for quite some time.
PR: 278936
Details
- Reviewers
markj
Diff Detail
- Repository
- rG FreeBSD src repository
Event Timeline
Still crashes with this fix & commit b6f4a3fa75d24637b4d81035655fcb3d3ea187ad applied.
Now crashing like this:
panic: Bad list head 0xffffffff80f31ec0 first->prev != head
cpuid = 2
time = 1716697553
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00da25e9a0
vpanic() at vpanic+0x13d/frame 0xfffffe00da25ead0
panic() at panic+0x43/frame 0xfffffe00da25eb30
do_unlink() at do_unlink+0x2d8/frame 0xfffffe00da25eb60
mqfs_remove() at mqfs_remove+0x59/frame 0xfffffe00da25eb90
VOP_REMOVE_APV() at VOP_REMOVE_APV+0x3a/frame 0xfffffe00da25ebb0
kern_funlinkat() at kern_funlinkat+0x473/frame 0xfffffe00da25ede0
sys_unlink() at sys_unlink+0x28/frame 0xfffffe00da25ee00
amd64_syscall() at amd64_syscall+0x158/frame 0xfffffe00da25ef30
fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe00da25ef30
--- syscall (10, FreeBSD ELF64, unlink), rip = 0x17dae350b52a, rsp = 0x17dae0fb24d8, rbp = 0x17dae0fb25f0 ---
KDB: enter: panic
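For readers unfamiliar with the "Bad list head ... first->prev != head" message: it appears to come from the INVARIANTS-only list consistency checks in sys/queue.h, which verify that the first element's back pointer still refers to the list head before the list is modified. The following is a minimal userland sketch of that kind of check, not the kernel code itself; `struct node`, `struct list_head`, and the three functions are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for a LIST_ENTRY/LIST_HEAD pair. */
struct node {
	struct node *next;	/* next element, or NULL at the tail */
	struct node **prevp;	/* address of the pointer that points at us */
};

struct list_head {
	struct node *first;	/* first element, or NULL if empty */
};

/* Insert n at the head, keeping every back pointer consistent. */
void
list_insert_head(struct list_head *head, struct node *n)
{
	assert(head != NULL && n != NULL);
	n->next = head->first;
	if (head->first != NULL)
		head->first->prevp = &n->next;
	head->first = n;
	n->prevp = &head->first;
}

/* Unlink n from whatever list it is on. */
void
list_remove(struct node *n)
{
	assert(n != NULL);
	if (n->next != NULL)
		n->next->prevp = n->prevp;
	*n->prevp = n->next;
}

/*
 * The head check: an empty list is fine; otherwise the first element's
 * back pointer must be the address of head->first.  A mismatch means
 * the element or the head was freed, reused, or overwritten.
 */
bool
list_head_ok(struct list_head *head)
{
	return (head->first == NULL ||
	    head->first->prevp == &head->first);
}
```

In this sketch, a list built only with `list_insert_head()` and `list_remove()` always passes `list_head_ok()`. The check fails only when something else has overwritten a back pointer, which is consistent with the memory corruption discussed in this review. Without such checks, the same corruption would go undetected at this point.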
I have no particular objection to removal of the filesystem, but mqueuefs.4 and posixmqcontrol.1 need to be updated as well to remove mentions of mount points.
E.g., the trivial immediately panicking issue fixed in f0a4dd6d46e99d47fde1 prevented mqueuefs mount at all
What happens with a non-invariants kernel? Does it panic as well?
sys/kern/uipc_mqueue.c | ||
---|---|---|
2907 | How does unloadable get set? |
There is definitely a memory corruption issue, and from the report, it seems that a panic is typically not triggered (immediately).
The most prominent case is a task that runs after its struct task and vnode have been freed, so vhold() etc. are called on the freed memory.
Of course, there are very serious issues with unmount.
share/man/man4/mqueuefs.4 | ||
---|---|---|
49 ↗ | (On Diff #139149) | This man page documents the whole mqueue module, not just the filesystem component. IMO it should still be kept, with references to the filesystem removed. |
The panic was immediate, and not related to unmount.
We'll need a way to list mqueues in the system. Otherwise we have no way to know about them and the user may create 100 queues that root can't fix by removing them.
I mean that, besides the issue you uncovered and other issues with the interaction between vnode and mqueue lifetimes, there are also hard-to-fix unmount races.
We'll need a way to list mqueues in the system. Otherwise we have no way to know about them and the user may create 100 queues that root can't fix by removing them.
I will add list/ls to posixmqcontrol, possibly after rewriting it from scratch.
share/man/man4/mqueuefs.4 | ||
---|---|---|
49 ↗ | (On Diff #139149) | The documentation for the whole module is a single sentence, 'The module contains system calls to manipulate All other text is about the filesystem proper. I do not see it as useful to create a single-sentence man page. |