if_enc(4) can pass IPsec payload to pfil(9) with the outer header or without it. In case of a small packet like ICMP, when mbuf cluster is not used, everything works fine. Otherwise, the first mbuf in a chain has m_len == 0 if it is asked to strip the outer header. pf was not handling such case, and erroneous reading of the outer IP header led to unexpected behavior.
Details
- Reviewers
glebius kp - Commits
- rG239e24eb0cf6: pf: Handle (*m0)->m_len < sizeof(struct ip) case
# kldload ipsec if_enc pf # kyua test -k /usr/tests/sys/netpfil/pf/Kyuafile if_enc
Diff Detail
- Repository
- rG FreeBSD src repository
- Lint
Lint Skipped - Unit
Tests Skipped
Event Timeline
Looks sane at first glance. I'll take a deeper look on Tuesday.
There's probably a few other places we want to do the pull-up too, such as the ethernet rules and the IPv6 rules.
Places like pf_isforlocal() might need an assert.
sys/netpfil/pf/pf.c | ||
---|---|---|
8084 | I'm a little unsure if it's worth checking this, or if we should just call m_pullup() unconditionally. |
Yes, the idea was to have a conclusion on IPv4+pf first, to make sure there is an agreement over the place and the way. Also, I spot the same issue on ipfw side. The next patches could extrapolate the respective new test cases with common code extraction as an option.
Let me know if it's expected to be a single big patch for pf, covering all the sub-cases: IPv4, IPv6, Ethernet, etc.
sys/netpfil/pf/pf.c | ||
---|---|---|
8084 | Yeah, my gut feeling is that CPU mechanisms like branch prediction and pipeline reset should be a lesser overhead here. The m_pullup call is expected to do m_get(); m_move_pkthdr(); bcopy(); chain unconditionally for M_EXT mbufS, which is expected to be the most frequent case, I guess. |
I'm actually very much surprised that this bug was there was all these years. This means most drivers never create such suboptimal packets.
P.S. We didn't pullup potential IP options. What happens if packet doesn't belong to a known protocol (TCP/UDP/etc) but has options and also options are allowed, are we going to dereference beyond sizeof(struct ip)?
sys/netpfil/pf/pf.c | ||
---|---|---|
8084 | Just augment it with __predict_false(m->m_len < sizeof(struct ip)). |
sys/netpfil/pf/pf.c | ||
---|---|---|
8084 | I considered this from the very beginning and ended up thinking of potential production cases with if_enc + pf, like the ones mentioned in PR273198. Hence, the pullup is expected to be used for most of the packets in such cases. Then it seems to be better to leave it for a hardware to decide. |
This update covers a potential race condition in the test case to avoid false negative.
A good point. FreeBSD PF does not read IPv4 options. FreeBSD PF has allow-opts syntax which simply works with ip.ip_hl > 5. It seems we could keep it simple for now until IPv4 options support is added to the FreeBSD PF, a chance of which is disputable. What do you think?
I suspect we've mostly gotten away with that because the IP stack looks at things before it hands it to pfil(9). Or at least, it does in most scenarios. Igor's found one where we don't, and then things break.
P.S. We didn't pullup potential IP options. What happens if packet doesn't belong to a known protocol (TCP/UDP/etc) but has options and also options are allowed, are we going to dereference beyond sizeof(struct ip)?
pf usually uses pf_pull_hdr() for things like that, and it does m_copydata(), which is safe no matter the mbuf layout.
We do need to take a careful look at the other instances of mtod() in pf, because there may be others (pf_test6() comes to mind) that need fixing.