I've been working on https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=272770 ("divert-to" rule creates packet loops on all FreeBSD 11.0 to 14.0 CURRENT versions).
I've done some research; my current understanding and conclusions are as follows.
The history I've collected:
- Originally "divert-to" in OpenBSD was not about divert(4)
- divert(4) support was added to OpenBSD as "divert-packet <divport>" syntax in 2009
- there is no "divert-packet" syntax in FreeBSD
- divert(4) support in FreeBSD was originally built around ipfw, with its rule numbering concept in mind
- "divert-to <addr> port <port>" syntax in FreeBSD can be used for divert(4) since 9.x
/* 'divapp' stands for an app that reads from a divert socket and probably writes back to it */
/* 'divsrc' and 'divdst' are the source and destination sockaddr_in within a divert(4) socket connection */
Considering a rule like "pass in proto udp from any to any port 53 divert-to 127.0.0.1 port 3355" the sequence of events is as follows:
- divapp gets a packet via recvfrom() with divsrc port set to the initial packet direction (PF_IN or PF_OUT, i.e. literally 1 or 2)
- divapp writes back via sendto() using the same divsrc port as the divdst port (the value is expected to be opaque to divapp)
- upon write, divert(4)'s logic adds an ipfw_rule_ref mtag with rulenum = <divdst port> + 1 (which computes to PF_IN+1=2 or PF_OUT+1=3)
- pf_test() logic marks a diverted packet as PF_MTAG_FLAG_PACKET_LOOPED only if rulenum is 0, but in practice rulenum can only be 2 or 3
- as a result, the packet written back by divapp is not marked as looped. If its content is not modified and it matches the same pf rule, it is diverted again. The same packet then loops indefinitely for as long as divapp is running and the pf rule is active
The ipfw_rule_ref.rulenum has no meaning for pf, and it looks like this "&& rulenum == 0" condition was always there. So far, I have not found a point in the src history where this condition has any meaning with respect to divert(4) logic. I saw the old-school divert(4) with mtag.cookie and the current one with mtag.rulenum = port+1, and neither yields rulenum=0.
That raises the question of why this was not spotted for many years if it's an obvious defect. Probably it's because divert(4) was historically used via ipfw, and where pf was used with divert(4), packets were usually altered and therefore did not match the same rule again. But now it seems we have pf+divert(4) users who expect ipfw-like behaviour, where an unaltered packet can be re-injected without being diverted again.
OpenBSD's "divert-packet" logic is very simple: their divert(4)'s sendto() marks a packet with the PF_TAG_DIVERTED_PACKET flag, and pf simply does PF_PASS if that flag is set, without considering any rules, exactly as their man 4 divert says. I guess we could keep this logic in reserve for a future as-is implementation of the same "divert-packet" syntax in FreeBSD pf, if it's wanted. For now, FreeBSD pf could go ahead with the current idea: a diverted packet coming back from divapp goes through the ruleset (if no state is used, otherwise it gets PF_PASS'ed) but is not diverted again if it matches any divert-to rule.
This diff matches the change proposed by the author of the related PR: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=260867. Let's credit them.
I would like to work on the respective "tests/sys/netpfil/pf" tests once the approach is agreed on.