Page MenuHomeFreeBSD

pf: add two asserts and align runtime debug for pf_get_addrs()
Needs ReviewPublic

Authored by franco_opnsense.org on Mon, Mar 3, 12:03 PM.
Tags
None
Referenced Files
F112555366: D49214.diff
Wed, Mar 19, 7:11 PM
Unknown Object (File)
Tue, Mar 18, 4:24 AM
Unknown Object (File)
Sun, Mar 16, 2:03 PM
Unknown Object (File)
Sat, Mar 15, 1:15 AM
Unknown Object (File)
Mon, Mar 10, 9:33 PM
Unknown Object (File)
Fri, Mar 7, 9:52 AM
Unknown Object (File)
Thu, Mar 6, 6:37 AM
Unknown Object (File)
Thu, Mar 6, 3:43 AM

Details

Summary

Looking into a table corruption while pf_get_addrs() is running
this can be triggering a message like pf_get_astats():

pfr_get_addrs: corruption detected (-226857)

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Passed
Unit
No Test Coverage
Build Status
Buildable 62781
Build 59665: arc lint + arc unit

Event Timeline

The description of the change doesn't make much sense. What is the race which causes us to copy fewer address than expected? Is it a bug, and here we're extending a band-aid? Or is the race benign, and then should we just remove the print?

I'm tracking down what looks to be a memory corruption of some sort:

Fatal trap 12: page fault while in kernel mode
cpuid = 2; apic id = 02
fault virtual address    = 0x28
fault code        = supervisor read data, page not present
instruction pointer    = 0x20:0xffffffff80d052f4
stack pointer            = 0x28:0xfffffe00b1080690
frame pointer            = 0x28:0xfffffe00b10806b0
code segment        = base 0x0, limit 0xfffff, type 0x1b
            = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags    = interrupt enabled, resume, IOPL = 0
current process        = 85006 (pfctl)
rdi: fffff8017b23ce60 rsi: fffffe00b10806c8 rdx: 0000000000000000
rcx: fffff8017b23ce60  r8: 0000000000000000  r9: 8080808080808080
rax: 0000000000000000 rbx: fffffe00b10806c8 rbp: fffffe00b10806b0
r10: fffff80173a54000 r11: ffffffffffffffff r12: 0000000000000000
r13: fffffe00c8482000 r14: ffffffff821edeb0 r15: fffff8017b23ce60
trap number        = 12
panic: page fault
cpuid = 2
time = 1740080523
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00b1080380
vpanic() at vpanic+0x131/frame 0xfffffe00b10804b0
panic() at panic+0x43/frame 0xfffffe00b1080510
trap_fatal() at trap_fatal+0x40b/frame 0xfffffe00b1080570
trap_pfault() at trap_pfault+0x46/frame 0xfffffe00b10805c0
calltrap() at calltrap+0x8/frame 0xfffffe00b10805c0
--- trap 0xc, rip = 0xffffffff80d052f4, rsp = 0xfffffe00b1080690, rbp = 0xfffffe00b10806b0 ---
rn_walktree() at rn_walktree+0x54/frame 0xfffffe00b10806b0
pfr_get_addrs() at pfr_get_addrs+0x122/frame 0xfffffe00b1080710
pfioctl() at pfioctl+0x221e/frame 0xfffffe00b1080bf0
devfs_ioctl() at devfs_ioctl+0xcb/frame 0xfffffe00b1080c40
vn_ioctl() at vn_ioctl+0xce/frame 0xfffffe00b1080cb0
devfs_ioctl_f() at devfs_ioctl_f+0x1e/frame 0xfffffe00b1080cd0
kern_ioctl() at kern_ioctl+0x255/frame 0xfffffe00b1080d40
sys_ioctl() at sys_ioctl+0xff/frame 0xfffffe00b1080e00
amd64_syscall() at amd64_syscall+0x100/frame 0xfffffe00b1080f30
fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe00b1080f30
--- syscall (54, FreeBSD ELF64, ioctl), rip = 0x14d360b785fa, rsp = 0x14d35c65b028, rbp = 0x14d35c65b4c0 ---
KDB: enter: panic

Systems exhibiting this problem crash frequently (up to multiple times a day). Sometimes they happen in random locations in the vm subsystem, but this particular issue sticks out and reoccurs at a stable frequently. All vmcores we gathered are defunct so we have to resort to printfs and check locking paths. We already bumped the pfr_get_addrs() path to a write lock and tried to debug radix.c a bit, but it's crashing without any obvious failure which means this is coming from somewhere else. This counter assert was the first evidence we got after changing this to a runtime value.

Apparently this line is what causes the panic returning to the stored "next" pointer after traversing the subtree:

https://github.com/freebsd/freebsd-src/blob/c6c63b92effd6d0662977f11bf429127b9dc4407/sys/net/radix.c#L1090

I'm tracking down what looks to be a memory corruption of some sort:

Have you tried testing a GENERIC-KASAN kernel?

I suspect that bugzilla is a better place to continue debugging this, as the root cause is as-yet unknown. Please cc me and we can continue debugging.

removed assert in tree traversal hotpath

I'm tracking down what looks to be a memory corruption of some sort:

Have you tried testing a GENERIC-KASAN kernel?

Not yet but I can. The fact of the matter is that this only happens for less than 1‰ of installations, but these can reproduce it. Not all the time but sometimes within minutes suggesting some form of traffic being involved here.

I suspect that bugzilla is a better place to continue debugging this, as the root cause is as-yet unknown. Please cc me and we can continue debugging.

I removed the asserts from this commit. It would still be beneficial to align the error handling. The returned result is garbled anyway. I'll add a bug report later.

In my local testing KASAN allows me to push 100 mbit max, which is going to be a challenge asking the user to deploy this in production and to trigger the bug which only happens under traffic. I can try, but not optimistic.

Something is filling the table while we read it with the lock held in pfr_get_addrs() is what this situation tells me, which points to a missing lock somewhere else, maybe introduced as far back as https://cgit.freebsd.org/src/commit/?id=890612bbeb69f

Other ideas to approach this?