Page MenuHomeFreeBSD

pf: prevent SCTP-based NULL dereference in pfi_kkif_match()
AbandonedPublic

Authored by franco_opnsense.org on Nov 18 2024, 10:06 AM.
Tags
None
Referenced Files
Unknown Object (File)
Thu, Jan 9, 10:11 PM
Unknown Object (File)
Dec 19 2024, 6:57 AM
Unknown Object (File)
Dec 13 2024, 1:28 AM
Unknown Object (File)
Nov 26 2024, 8:20 AM
Unknown Object (File)
Nov 24 2024, 4:39 PM
Unknown Object (File)
Nov 23 2024, 7:50 PM
Unknown Object (File)
Nov 23 2024, 3:12 PM
Unknown Object (File)
Nov 22 2024, 6:00 AM

Details

Reviewers
kp
Summary

It appears that 0fe663b2a815 can cause a panic when being run
as it reaches for IFG_ALL with a NULL pointer for the pfik_ifp
inside packet_kif. The stack trace as presented by the reporter:

  • trap 0xc, rip = 0xffffffff821ab744, rsp = 0xfffffe00625cef50, rbp = 0xfffffe00625cef50 --- pfi_kkif_match() at pfi_kkif_match+0x24/frame 0xfffffe00625cef50 pf_test_rule() at pf_test_rule+0xe6b/frame 0xfffffe00625cf3a0 pf_sctp_multihome_delayed() at pf_sctp_multihome_delayed+0x30e/frame 0xfffffe00625cf4d0 pf_test() at pf_test+0xd1a/frame 0xfffffe00625cf680 pf_check_in() at pf_check_in+0x27/frame 0xfffffe00625cf6a0 pfil_mbuf_in() at pfil_mbuf_in+0x38/frame 0xfffffe00625cf6d0 enc_hhook() at enc_hhook+0x28a/frame 0xfffffe00625cf710 hhook_run_hooks() at hhook_run_hooks+0x61/frame 0xfffffe00625cf780 ipsec_run_hhooks() at ipsec_run_hhooks+0x6d/frame 0xfffffe00625cf7a0 ipsec4_common_input_cb() at ipsec4_common_input_cb+0x32a/frame 0xfffffe00625cf830 esp_input_cb() at esp_input_cb+0x430/frame 0xfffffe00625cf8e0 swcr_process() at swcr_process+0x25/frame 0xfffffe00625cf900 crypto_dispatch() at crypto_dispatch+0x60/frame 0xfffffe00625cf920 esp_input() at esp_input+0x4d8/frame 0xfffffe00625cf9f0 udp_ipsec_input() at udp_ipsec_input+0x17b/frame 0xfffffe00625cfa50 ipsec_kmod_udp_input() at ipsec_kmod_udp_input+0x2d/frame 0xfffffe00625cfa70 udp_append() at udp_append+0xe4/frame 0xfffffe00625cfae0 udp_input() at udp_input+0x803/frame 0xfffffe00625cfbc0 ip_input() at ip_input+0x268/frame 0xfffffe00625cfc20 netisr_dispatch_src() at netisr_dispatch_src+0x9e/frame 0xfffffe00625cfc70 ether_demux() at ether_demux+0x149/frame 0xfffffe00625cfca0 ether_nh_input() at ether_nh_input+0x36a/frame 0xfffffe00625cfd00 netisr_dispatch_src() at netisr_dispatch_src+0x9e/frame 0xfffffe00625cfd50 ether_input() at ether_input+0x56/frame 0xfffffe00625cfda0 re_rxeof() at re_rxeof+0x547/frame 0xfffffe00625cfe20 re_intr_msi() at re_intr_msi+0xf3/frame 0xfffffe00625cfe60 ithread_loop() at ithread_loop+0x257/frame 0xfffffe00625cfef0 fork_exit() at fork_exit+0x7f/frame 0xfffffe00625cff30 fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe00625cff30
  • trap 0, rip = 0, rsp = 0, rbp = 0 ---

PR: https://github.com/opnsense/src/issues/227
Sponsored by: OPNsense

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Passed
Unit
No Test Coverage
Build Status
Buildable 60662
Build 57546: arc lint + arc unit

Event Timeline

The backtrace appears to have gotten mangled here.
Also, a gdb backtrace would be more useful as it'll decode to line numbers rather than to addresses.

This strikes me as overly broad. "We hit a NULL pointer here, so let's just check for that in this exact location" isn't an ideal approach to a fix.

We're not usually going to hit a 'packet_kif' that doesn't have a real interface associated with it (i.e. it's expected to always have pfik_ifp set). The SCTP multi home path is somewhat unusual there. Either that needs to be changed (although then we'd probably end up having to add special case flags to pf_test_rule(), which isn't ideal), or pfi_kkif_match() needs to check packet_kif against V_pfi_all. That in turn raises an interesting question: what is the expected behaviour of an interface matching rule if the interface isn't know when the rule is evaluated?
Perhaps this does suggest that the former option needs to be explored.

Finally, this really wants a test case. Presumably the crash scenario is when there's a multi homed SCTP connection (test examples already exist) and the ruleset contains a rule that matches on interface (which I believe is not something that's currently tested).

Sure. For context:

(kgdb) frame 18
#18 pfi_kkif_match (rule_kif=0xfffff8001830a100, packet_kif=packet_kif@entry=0xfffff80005b1ee00) at /usr/src/sys/netpfil/pf/pf_if.c:433
433			CK_STAILQ_FOREACH(p, &packet_kif->pfik_ifp->if_groups, ifgl_next)
(kgdb) p *packet_kif
$1 = {pfik_name = "all", '\000' <repeats 12 times>, _pfik_glue = {_pfik_tree = {rbe_link = {0xfffff80018427900, 0x0, 0x0}}, _pfik_list = {le_next = 0xfffff80018427900, le_prev = 0x0}}, 
  pfik_packets = {{{{counter = 0xfffffe00e7046d70}, {counter = 0xfffffe00e7046d60}}, {{counter = 0xfffffe00e7046d50}, {counter = 0xfffffe00e7046d40}}}, {{{counter = 0xfffffe00e7046d30}, {
          counter = 0xfffffe00e7046d20}}, {{counter = 0xfffffe00e7046d10}, {counter = 0xfffffe00e7046d00}}}}, pfik_bytes = {{{{counter = 0xfffffe00e7046d68}, {counter = 0xfffffe00e7046d58}}, {{
          counter = 0xfffffe00e7046d48}, {counter = 0xfffffe00e7046d38}}}, {{{counter = 0xfffffe00e7046d28}, {counter = 0xfffffe00e7046d18}}, {{counter = 0xfffffe00e7046d08}, {
          counter = 0xfffffe00e7046cf8}}}}, pfik_tzero = 1731419937, pfik_flags = 1, pfik_ifp = 0x0, pfik_group = 0xfffff8000397c340, pfik_rulerefs = 37, pfik_dynaddrs = {
    tqh_first = 0xfffff80018214e00, tqh_last = 0xfffff800187c4d00}}
(kgdb) frame 19
#19 0xffffffff8238a953 in pf_test_rule (rm=rm@entry=0xfffffe006259b358, sm=sm@entry=0xfffffe006259b388, kif=0xfffff80005b1ee00, m=0xfffff800ca08e400, off=off@entry=20, 
    pd=pd@entry=0xfffff8001835a210, am=0xfffffe006259b338, rsm=0xfffffe006259b340, inp=0x0) at /usr/src/sys/netpfil/pf/pf.c:4651
4651			if (pfi_kkif_match(r->kif, kif) == r->ifnot)
(kgdb) list
4646		}
4647	
4648		SLIST_INIT(&match_rules);
4649		while (r != NULL) {
4650			pf_counter_u64_add(&r->evaluations, 1);
4651			if (pfi_kkif_match(r->kif, kif) == r->ifnot)
4652				r = r->skip[PF_SKIP_IFP].ptr;
4653			else if (r->direction && r->direction != pd->dir)
4654				r = r->skip[PF_SKIP_DIR].ptr;
4655			else if (r->af && r->af != af)

If IFG_ALL is not supposed to have a pfik_ifp set why are we passing it downwards to create floating states as per 0fe663b2a815 that eventually runs into this? At which layer should this be prevented from happening?

Sure. For context:

<snip>
The entire backtrace would have been nice.
Still, that's not the most important thing right now. Concentrate on the test case, which will mean we can trivially reproduce backtraces and we can circle back to improving the commit message once we have both the test case and an actual fix.

If IFG_ALL is not supposed to have a pfik_ifp set why are we passing it downwards to create floating states as per 0fe663b2a815 that eventually runs into this? At which layer should this be prevented from happening?

The pd->kif is set pretty early on in the pf flow, via pf_setup_pdesc(): https://cgit.freebsd.org/src/tree/sys/netpfil/pf/pf.c#n8649
That code got reshuffled a little, but not in ways that matter to this discussion.

There's no separate layer that handles this later. The issue is one of mismatches assumptions between the SCTP multihome code and pfi_kkif_match().

To reiterate, the priorities now are, in this order:

  • a test case
  • Answering "what is the expected behaviour of an interface matching rule if the interface isn't know when the rule is evaluated?", so we can figure out how to best fix this
  • actually fixing the problem
  • writing up a useful commit message, explaining both the problem and the chosen solution, with any tradeoffs that may be involved.
In D47658#1086724, @kp wrote:

The entire backtrace would have been nice.

Just let me know what you need? Just "bt" or more?

  • a test case

I'll see what I can do. Thanks so far.

In D47658#1086724, @kp wrote:

The entire backtrace would have been nice.

Just let me know what you need? Just "bt" or more?

Let's start with the full backtrace, yes.

Here it is:

(kgdb) bt
#0  __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:57
#1  doadump (textdump=textdump@entry=0) at /usr/src/sys/kern/kern_shutdown.c:405
#2  0xffffffff8049c36a in db_dump (dummy=<optimized out>, dummy2=<optimized out>, dummy3=<optimized out>, dummy4=<optimized out>) at /usr/src/sys/ddb/db_command.c:591
#3  0xffffffff8049c16d in db_command (last_cmdp=<optimized out>, cmd_table=<optimized out>, dopager=false) at /usr/src/sys/ddb/db_command.c:504
#4  0xffffffff8049c2b6 in db_command_script (command=command@entry=0xffffffff81bbf6d3 <db_recursion_data+3> "dump") at /usr/src/sys/ddb/db_command.c:569
#5  0xffffffff804a1528 in db_script_exec (scriptname=<optimized out>, warnifnotfound=warnifnotfound@entry=0) at /usr/src/sys/ddb/db_script.c:302
#6  0xffffffff804a1435 in db_script_kdbenter (eventname=<optimized out>) at /usr/src/sys/ddb/db_script.c:325
#7  0xffffffff8049f4f1 in db_trap (type=<optimized out>, code=<optimized out>) at /usr/src/sys/ddb/db_main.c:267
#8  0xffffffff80c09668 in kdb_trap (type=type@entry=3, code=code@entry=0, tf=tf@entry=0xfffffe006259aab0) at /usr/src/sys/kern/subr_kdb.c:790
#9  0xffffffff810e0419 in trap (frame=0xfffffe006259aab0) at /usr/src/sys/amd64/amd64/trap.c:608
#10 <signal handler called>
#11 kdb_enter (why=<optimized out>, msg=<optimized out>) at /usr/src/sys/kern/subr_kdb.c:556
#12 0xffffffff80bb8fd2 in vpanic (fmt=0xffffffff81266209 "%s", ap=ap@entry=0xfffffe006259ace0) at /usr/src/sys/kern/kern_shutdown.c:955
#13 0xffffffff80bb9083 in panic (fmt=0xffffffff81d82c18 <cnputs_mtx+24> "") at /usr/src/sys/kern/kern_shutdown.c:891
#14 0xffffffff810e0eab in trap_fatal (frame=0xfffffe006259adc0, eva=24) at /usr/src/sys/amd64/amd64/trap.c:952
#15 0xffffffff810e0f07 in trap_pfault (frame=<optimized out>, usermode=<optimized out>, signo=<optimized out>, ucode=<optimized out>) at /usr/src/sys/amd64/amd64/trap.c:760
#16 <signal handler called>
#17 ck_pr_md_load_ptr (target=<optimized out>) at /usr/src/sys/contrib/ck/include/gcc/x86_64/ck_pr.h:185
#18 pfi_kkif_match (rule_kif=0xfffff8001830a100, packet_kif=packet_kif@entry=0xfffff80005b1ee00) at /usr/src/sys/netpfil/pf/pf_if.c:433
#19 0xffffffff8238a953 in pf_test_rule (rm=rm@entry=0xfffffe006259b358, sm=sm@entry=0xfffffe006259b388, kif=0xfffff80005b1ee00, m=0xfffff800ca08e400, off=off@entry=20, 
    pd=pd@entry=0xfffff8001835a210, am=0xfffffe006259b338, rsm=0xfffffe006259b340, inp=0x0) at /usr/src/sys/netpfil/pf/pf.c:4651
#20 0xffffffff82390d14 in pf_sctp_multihome_delayed (pd=pd@entry=0xfffffe006259b450, off=20, kif=0xfffff80005b1d700, s=s@entry=0xfffff800ca33a580, action=<optimized out>)
    at /usr/src/sys/netpfil/pf/pf.c:6177
#21 0xffffffff82385f29 in pf_test (dir=dir@entry=1, pflags=<optimized out>, ifp=0xfffff80005bd5000, m0=m0@entry=0xfffffe006259b730, inp=<optimized out>, 
    default_actions=default_actions@entry=0x0) at /usr/src/sys/netpfil/pf/pf.c:8506
#22 0xffffffff823adf27 in pf_check_in (m=0xfffffe006259b730, ifp=0x1, flags=0, ruleset=<optimized out>, inp=0xffffffdb) at /usr/src/sys/netpfil/pf/pf_ioctl.c:6575
#23 0xffffffff80d19dd8 in pfil_mbuf_common (pch=<optimized out>, m=0xfffffe006259b730, ifp=0xfffff80005bd5000, ifp@entry=0x0, flags=65536, inp=0x0) at /usr/src/sys/net/pfil.c:212
#24 pfil_mbuf_in (head=<optimized out>, m=0xfffffe006259b730, ifp=ifp@entry=0xfffff80005bd5000, inp=0x0) at /usr/src/sys/net/pfil.c:230
#25 0xffffffff8245867a in enc_hhook (hhook_type=<optimized out>, hhook_id=<optimized out>, udata=<optimized out>, ctx_data=<optimized out>, hdata=<optimized out>, hosd=<optimized out>)
    at /usr/src/sys/net/if_enc.c:304
#26 0xffffffff80b6c7ff in hhook_run_hooks (hhh=0xfffff800035c2300, ctx_data=ctx_data@entry=0xfffffe006259b710, hosd=hosd@entry=0x0) at /usr/src/sys/kern/kern_hhook.c:121
#27 0xffffffff82c353cd in ipsec_run_hhooks (ctx=ctx@entry=0xfffffe006259b710, type=type@entry=3) at /usr/src/sys/netipsec/ipsec.c:862
#28 0xffffffff82c380a4 in ipsec4_common_input_cb (m=0xfffff800ca08e400, sav=0xfffff8001877be00, skip=<optimized out>, protoff=protoff@entry=9) at /usr/src/sys/netipsec/ipsec_input.c:484
--Type <RET> for more, q to quit, c to continue without paging--
#29 0xffffffff82c3f4ad in esp_input_cb (crp=0x0) at /usr/src/sys/netipsec/xform_esp.c:667
#30 0xffffffff80ebfdb5 in swcr_process (dev=<optimized out>, crp=0xfffff80072ef5348, hint=<optimized out>) at /usr/src/sys/opencrypto/cryptosoft.c:1682
#31 0xffffffff80ebec3c in CRYPTODEV_PROCESS (dev=0xfffff80003758300, op=0xfffff80072ef5348, flags=<optimized out>) at ./cryptodev_if.h:161
#32 crypto_invoke (cap=<optimized out>, crp=crp@entry=0xfffff80072ef5348, hint=hint@entry=0) at /usr/src/sys/opencrypto/crypto.c:1583
#33 0xffffffff80ebd774 in crypto_dispatch_one (crp=crp@entry=0xfffff80072ef5348, hint=hint@entry=0) at /usr/src/sys/opencrypto/crypto.c:1432
#34 0xffffffff80ebd679 in crypto_dispatch (crp=0xffffffff81ddc480 <epoch_array+384>, crp@entry=0xfffff80072ef5348) at /usr/src/sys/opencrypto/crypto.c:1448
#35 0xffffffff82c3e36e in esp_input (m=0xfffff800ca08e400, sav=0xfffff8001877be00, skip=20, protoff=<optimized out>) at /usr/src/sys/netipsec/xform_esp.c:475
#36 0xffffffff82c415a7 in udp_ipsec_input (m=0xfffff800ca08e400, off=9, af=<optimized out>) at /usr/src/sys/netipsec/udpencap.c:238
#37 0xffffffff80e3b0bd in ipsec_kmod_udp_input (sc=sc@entry=0xffffffff81e16aa8 <ipv4_ipsec>, m=0xffffffff81ddc480 <epoch_array+384>, m@entry=0xfffff800ca08e400, off=0, off@entry=28, 
    af=0, af@entry=2) at /usr/src/sys/netipsec/subr_ipsec.c:333
#38 0xffffffff80de3d32 in udp_append (inp=inp@entry=0xfffff800b7425000, ip=ip@entry=0xfffff800b740c80e, n=0xfffff800ca08e400, off=28, off@entry=20, 
    udp_in=udp_in@entry=0xfffffe006259bad0) at /usr/src/sys/netinet/udp_usrreq.c:272
#39 0xffffffff80de3703 in udp_input (mp=<optimized out>, offp=<optimized out>, proto=17) at /usr/src/sys/netinet/udp_usrreq.c:663
#40 0xffffffff80d9f6b0 in ip_input (m=0x0) at /usr/src/sys/netinet/ip_input.c:888
#41 0xffffffff80d15e5e in netisr_dispatch_src (proto=proto@entry=1, source=source@entry=0, m=0xfffff800ca08e400) at /usr/src/sys/net/netisr.c:1152
#42 0xffffffff80d161ff in netisr_dispatch (proto=2178794624, proto@entry=1, m=0xfffff80003160278) at /usr/src/sys/net/netisr.c:1243
#43 0xffffffff80cf7729 in ether_demux (ifp=ifp@entry=0xfffff800035db000, m=0xfffff800ca08e400) at /usr/src/sys/net/if_ethersubr.c:960
#44 0xffffffff80cf8ed9 in ether_input_internal (ifp=0xfffff800035db000, m=0xfffff800ca08e400) at /usr/src/sys/net/if_ethersubr.c:718
#45 ether_nh_input (m=<optimized out>) at /usr/src/sys/net/if_ethersubr.c:748
#46 0xffffffff80d15e5e in netisr_dispatch_src (proto=proto@entry=5, source=source@entry=0, m=0xfffff800ca08e400) at /usr/src/sys/net/netisr.c:1152
#47 0xffffffff80d161ff in netisr_dispatch (proto=2178794624, proto@entry=5, m=0xfffff80003160278) at /usr/src/sys/net/netisr.c:1243
#48 0xffffffff80cf7c25 in ether_input (ifp=0xfffff800035db000, m=<optimized out>) at /usr/src/sys/net/if_ethersubr.c:865
#49 0xffffffff808767e5 in re_rxeof (sc=sc@entry=0xfffffe0062717000, rx_npktsp=0x0) at /usr/src/sys/dev/re/if_re.c:2387
#50 0xffffffff80874263 in re_intr_msi (xsc=0xfffffe0062717000) at /usr/src/sys/dev/re/if_re.c:2683
#51 0xffffffff80b6fe16 in intr_event_execute_handlers (ie=0xfffff800035d2700, p=<optimized out>) at /usr/src/sys/kern/kern_intr.c:1205
#52 ithread_execute_handlers (ie=0xfffff800035d2700, p=<optimized out>) at /usr/src/sys/kern/kern_intr.c:1218
#53 ithread_loop (arg=arg@entry=0xfffff800035e4e00) at /usr/src/sys/kern/kern_intr.c:1306
#54 0xffffffff80b6c252 in fork_exit (callout=0xffffffff80b6fbc0 <ithread_loop>, arg=0xfffff800035e4e00, frame=0xfffffe006259bf40) at /usr/src/sys/kern/kern_fork.c:1164
#55 <signal handler called>

For the record I have no stakes in SCTP and I'm not involved in the changes done here. This work with the user reporting it is a courtesy to the FreeBSD pf code and to triage panics which I only think is a good idea for production code.