Move vm_object_scan_all_shadowed from vm_object.c to swap_pager.c, and rename it. In the moved function, use vm_page and swblk iterators to advance through the objects. Avoid checking a backing page for busyness or validity more than once, or when it is beyond the upper boundof the scan.
Details
Diff Detail
- Lint
Lint Skipped - Unit
Tests Skipped
Event Timeline
sys/vm/swap_pager.c | ||
---|---|---|
2505 | Since this is visible function, we must assert that object is swapping object, i.e. OBJ_SWAP. It is also probably reasonable to assert that it is OBJ_ANON, and that object->backing_object exists. | |
2527 | ||
2568 | Since you unbusied p right after tryxbusy(), this statement is no longer true. |
sys/vm/swap_pager.c | ||
---|---|---|
2505 | The call to swblk_iter_init_only in line 2517 leads to an OBJ_SWAP assertion. The code is written to handle non-ANON objects by immediately returning false; I don't understand why I should change that behavior. If backing_object is NULL, the kernel will crash immediately. I didn't think that we commonly added assertions that replace one immediate crash with another. | |
2568 | I think you're telling me to fix a comment, and not to rewrite the code. But I'm not sure of that. Also, if ps < pv, then p probably wasn't doing anything to affect pp. I'll rewrite the comment, but probably not very well. |
I ran tests with D47150.144979.patch for 17 hours before getting this seemingly unrelated ext2fs panic:
https://people.freebsd.org/~pho/stress/log/log0555.txt
sys/vm/swap_pager.c | ||
---|---|---|
2505 | I mean that OBJ_ANON should be asserted on object, not backing object. The object must be anon because it shadows the below object. | |
2568 | Then I do not understand why ever do the tryxbusy/unbusy dance. Can you just check that the page is not xbusy above? But what prevents unlocked lookup from busying the page and then the caller can do something incompatible with the collapse? |
sys/vm/vm_object.c | ||
---|---|---|
1757 | Can you explain "busy of p disallows fault handler to validate pp" in the case when pi < p->pindex? |
Add an assertion.
Don't do any unbusying of a backing page until the parent page has been checked out, in case that unbusying has some bad effect that I don't really understand. I assume that parent pages without backing pages (but backed by swap blocks) work okay now, and won't be affected by this.
sys/vm/swap_pager.c | ||
---|---|---|
2568 | I tried to take the tryxbusy/unbusy code that was already there and just avoid having the unbusy appear in so many places. I don't know why its there, which is why I'm not intentionally changing it. |
sys/vm/vm_object.c | ||
---|---|---|
1757 | Immediate case is the failed CoW. For it to happen, p must be valid, and pp was allocated by the fault handler, initially invalid. Then both p and pp are xbusy, the objects are not locked, and the fault handler copies content of p to pp. See vm_fault_cow(), the 'oh well lets copy' part. |
Rewrote some parts with perhaps a better understanding of what the busy/unbusy parts are intended to do.
I cannot boot with D47150.145065.patch:
: pci8: <ACPI PCI bus> on pcib8 igb0: <Intel(R) I350 (Copper)> port 0xd020-0xd03f mem 0xfb320000-0xfb33ffff,0xfb344000-0xfb347fff irq 16 at device 0.0 on pci8 igb0: EEPROM V1.63-0 eTrack 0x800009fa igb0: Using 1024 TX descriptors and 1024 RX descriptors igb0: queue equality override not set, capping rx_queues at 6 and tx_queues at 6 igb0: Usinollapse+0x607/frame 0xfffffe0108438cb0 vmspace_fork() at vmspace_fork+0xb67/frame 0xfffffe0108438d30 fork1() at fork1+0x4f8/frame 0xfffffe0108438da0 sys_fork() at sys_fork+0x54/frame 0xfffffe0108438e00 amd64_syscall() at amd64_syscall+0x158/frame 0xfffffe0108438f30 fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe0108438f30 --- syscall (2, FreeBSD ELF64, fork), rip = 0x10a3cebd165a, rsp = 0x10a3cad3ba08, rbp = 0x10a3cad3ba40 --- KDB: enter: panic [ thread pid 38 tid 100213 ] Stopped at kdb_enter+0x33: movq $0,0x1052702(%rip) db> show registers cs 0x20 ds 0x3b es 0x3b fs 0x13 gs 0x1b ss 0x28 rax 0x12 rcx 0x4c0cb7c143e92607 rdx 0xffffffff811ece61 rbx 0x100 rsp 0xfffffe0108438868 rbp 0xfffffe0108438990 rsi 0xfffffe0108438720 rdi 0xffffffff81b9d3c0 cnputs_mtx r8 0x10 r9 0x10 r10 0xf r11 0x10 r12 0 r13 0xfffff8000afc3e88 r14 0xffffffff811b0c1c r15 0xfffff8000a21a740 rip 0xffffffff80ba6503 kdb_enter+0x33 rflags 0x86 kdb_enter+0x33: movq $0,0x1052702(%rip) db> x/s version version: FreeBSD 15.0-CURRENT #1 main-n273016-fa573868f1879-dirty: Fri Oct 18 08:40:46 CEST 2024\012 pho@mercat1.netperf.freebsd.org:/usr/src/sys/amd64/compile/PHO\012 db> bt Tracing pid 38 tid 100213 td 0xfffff8000a21a740 kdb_enter() at kdb_enter+0x33/frame 0xfffffe0108438990 panic() at panic+0x43/frame 0xfffffe01084389f0 swap_pager_scan_all_shadowed() at swap_pager_scan_all_shadowed+0x27f/frame 0xfffffe0108438c30 vm_object_collapse() at vm_object_collapse+0x607/frame 0xfffffe0108438cb0 vmspace_fork() at vmspace_fork+0xb67/frame 0xfffffe0108438d30 fork1() at fork1+0x4f8/frame 0xfffffe0108438da0 sys_fork() at sys_fork+0x54/frame 0xfffffe0108438e00 amd64_syscall() at amd64_syscall+0x158/frame 0xfffffe0108438f30 fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe0108438f30 --- syscall (2, FreeBSD ELF64, fork), rip = 0x10a3cebd165a, rsp = 0x10a3cad3ba08, rbp = 0x10a3cad3ba40 --- db>
Oops. Forgot the panic string :)
db> show panic panic: Shadow object is not anonymous db> bt Tracing pid 38 tid 100213 td 0xfffff80014e12740 kdb_enter() at kdb_enter+0x33/frame 0xfffffe0108438990 panic() at panic+0x43/frame 0xfffffe01084389f0 swap_pager_scan_all_shadowed() at swap_pager_scan_all_shadowed+0x27f/frame 0xfffffe0108438c30 vm_object_collapse() at vm_object_collapse+0x607/frame 0xfffffe0108438cb0 vmspace_fork() at vmspace_fork+0xb67/frame 0xfffffe0108438d30 fork1() at fork1+0x4f8/frame 0xfffffe0108438da0 sys_fork() at sys_fork+0x54/frame 0xfffffe0108438e00 amd64_syscall() at amd64_syscall+0x158/frame 0xfffffe0108438f30 fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe0108438f30 --- syscall (2, FreeBSD ELF64, fork), rip = 0x6fb515f865a, rsp = 0x6fb4e3982d8, rbp = 0x6fb4e398310 --- db>
What was the assert? I can see only one case where it might be not true, when vm_map_entry_delete() calls collapse for kernel object. But this is definitely not the case for the Peter' backtrace.
KASSERT((object->flags & OBJ_ANON) == 0, ("Shadow object is not anonymous"));
I guess it was supposed to be != instead.
sys/vm/swap_pager.c | ||
---|---|---|
2577 | Here, we know that this turns into a call to swap_pager_haspage(), so the loop might benefit from maintaining an iterator for the swap blocks of the shadow object as well. |
sys/vm/swap_pager.c | ||
---|---|---|
2577 | I'll rename blks to backing_blks, and replace vm_pager_has_page with swap_pager_haspage, in anticipation of a subsequent change based on this suggestion, but I think it should be a separate change from this one to modify swap_pager_haspage and swp_pager_meta_lookup. |
sys/vm/swap_pager.c | ||
---|---|---|
2577 | Certainly, I'd also prefer not to make this patch more complex than it already is. |