Page MenuHomeFreeBSD

ipsec: improve integration with unmapped mbufs (on Tx)
AcceptedPublic

Authored by kib on Tue, Dec 31, 7:42 AM.
Tags
None
Referenced Files
F106937081: D48265.diff
Tue, Jan 7, 6:13 PM
Unknown Object (File)
Tue, Jan 7, 1:07 PM
Unknown Object (File)
Tue, Jan 7, 12:35 PM
Unknown Object (File)
Tue, Jan 7, 12:02 PM
Unknown Object (File)
Tue, Jan 7, 9:31 AM
Unknown Object (File)
Tue, Jan 7, 6:39 AM
Unknown Object (File)
Tue, Jan 7, 6:35 AM
Unknown Object (File)
Tue, Jan 7, 6:01 AM

Details

Reviewers
jhb
markj
Summary

IPv4 only ATM, want to get some feedback before handling IPv6.

Only map mbuf when a policy is looked up and indicates that IPSEC needs to transform the packet.
If IPSEC is inline offloaded, it is up to the interface driver to request remap if needed.

Fetch the IP header using m_copydata() instead of using mtod() to select policy/SA.

Change interface of mb_unmapped_to_ext() to distinguish failures between OOM and incompatible packets. Note that right now, TLS packets passed to IPSEC mean insta-panic for debug kernels, and UB for non-debug. This should be fixed by the patch.

Ideas for better diagnostic/action on EINVAL from mb_unmapped_to_ext() are solicited.

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Skipped
Unit
Tests Skipped

Event Timeline

kib requested review of this revision.Tue, Dec 31, 7:42 AM

I wonder why exactly software ipsec cannot process unmapped mbufs. In an unmapped ktls packet, the protocol headers are still mapped. netipsec needs to access the payload in order to encrypt, but it uses opencrypto's cursor abstraction, and crypto_cursor_* handles unmapped mbufs. What is missing?

Basically, I wonder if we can side-step the problem by making netipsec compatible with unmapped mbufs.

sys/netinet/ip_output.c
690–691

Why not do this within the existing if (IPSEC_ENABLED()) block above?

sys/netipsec/ipsec.c
492–493

ip1 can be a pointer to const.

sys/netipsec/ipsec_output.c
922

This makes me nervous: it's very easy to accidentally invalidate the ip pointer at some point, especially since neither m nor ip is pointing to const here or in callees. Can we avoid fetching the pointer here and defer that to ipsec4_perform_request().

kib marked 2 inline comments as done.EditedTue, Dec 31, 5:20 PM

I wonder why exactly software ipsec cannot process unmapped mbufs. In an unmapped ktls packet, the protocol headers are still mapped. netipsec needs to access the payload in order to encrypt, but it uses opencrypto's cursor abstraction, and crypto_cursor_* handles unmapped mbufs. What is missing?

Basically, I wonder if we can side-step the problem by making netipsec compatible with unmapped mbufs.

Well, it started with PR 272616. There m_unshare uses memcpy() not being adopted for the unmapped buffers.

I think my patch moves the code in the right direction, and is the useful first step. For now, it avoids the cost of re-mapping incurred for all packets just if ipsec.ko is loaded. Also, it:

  • handles incompatibility/panic/UB between ipsec and ktls
  • should improve offloaded ipsec because packets are kept unmapped there.
sys/netipsec/ipsec_output.c
922

I do not share this opinion. The code in the function is straightforward from there to the perform_request() point. There only calculations that are done in between, is assigning the mtag.

I can add an assert that ip is same if recalculated right before the perform_request() call.

Mark one place *ip as const.
Collapse two if()s.

In D48265#1100756, @kib wrote:

I wonder why exactly software ipsec cannot process unmapped mbufs. In an unmapped ktls packet, the protocol headers are still mapped. netipsec needs to access the payload in order to encrypt, but it uses opencrypto's cursor abstraction, and crypto_cursor_* handles unmapped mbufs. What is missing?

Basically, I wonder if we can side-step the problem by making netipsec compatible with unmapped mbufs.

Well, it started with PR 272616. There m_unshare uses memcpy() not being adopted for the unmapped buffers.

I think my patch moves the code in the right direction, and is the useful first step. For now, it avoids the cost of re-mapping incurred for all packets just if ipsec.ko is loaded. Also, it:

  • handles incompatibility/panic/UB between ipsec and ktls
  • should improve offloaded ipsec because packets are kept unmapped there.

Sorry, I didn't quite understand the patch the first time. I think the approach is ok. Soon I might need to make sw KTLS and IPSec compatible with each other for an upcoming project, but I'm not sure yet. I do not see how hw KTLS can be made to work with IPsec at all.

sys/kern/kern_mbuf.c
953

The failure semantics are inconsistent, since in this case we do not free m. I believe mb_unmapped_to_ext() will leak an mbuf in this case.

1099–1102

Suppose _mb_unmapped_to_ext() fails with ENOBUFS on the first mbuf in the chain. It will free m, and top == m, so here we'll free the mbuf twice.

In practice I suspect this never happens because the first mbuf will be mapped.

sys/netipsec/ipsec_output.c
922

The problem is not only in this function. ipsec4_perform_request() passes &m to HHOOK_TYPE_IPSEC_OUT hooks, which may replace the mbuf. This could invalidate the IP header pointer, which is dereferenced later in the sctp_delayed_checksum() call.

kib marked 4 inline comments as done.Wed, Jan 1, 11:46 AM
kib added inline comments.
sys/kern/kern_mbuf.c
1099–1102

Isn't the m_freem(next) call also problematic? I replaced m_freem(top) with conditional m_free(top), to keep next valid always.

sys/netipsec/ipsec_output.c
922

Then this is the place to re-evaluate ip.

kib marked 2 inline comments as done.

Bug fixes:

  • try to handle mbuf leaks and double-free on map failure
  • re-calculate *ip after hooks

I do not see how hw KTLS can be made to work with IPsec at all.

Forgot to answer this. FWIK, nvidia hw does not support more then one crypto op over the packet. In other words, netcard KTLS + inline IPSEC offload cannot coexist.

This is the place where my patch stops, but still a better diagnostic could be useful.

Fetch the IP header using m_copydata() instead of using mtod() to select policy/SA.

This is just to handle the possibility that the IP header is unmapped, right? If so, I am still somewhat skeptical about it: IPSEC_OUTPUT/IPSEC_FORWARD is called by functions which access the protocol headers directly with mtod(). In particular, m_copydata() is more expensive than an inline memcpy with constant size.

In D48265#1101142, @kib wrote:

I do not see how hw KTLS can be made to work with IPsec at all.

Forgot to answer this. FWIK, nvidia hw does not support more then one crypto op over the packet. In other words, netcard KTLS + inline IPSEC offload cannot coexist.

This is the place where my patch stops, but still a better diagnostic could be useful.

If we can support sw KTLS with IPSec enabled, so there is no need to map mbufs, then perhaps we can simply disallow NIC KTLS session creation if the NIC currently has IPSec offload enabled. That is, rather than checking each packet in the data path, we just do not permit enabling both types of offload in a single ifnet.

sys/netinet6/ip6_output.c
1122

Certainly we have to at least return an error here, as m is now a pointer to freed memory.

sys/netipsec/ipsec_output.c
114–115

ip can be a pointer to const.

sys/netipsec/subr_ipsec.c
64–65

ip1 can be a pointer to const.

kib marked 3 inline comments as done.

More const ip1 *.
More love to ipv6 case.

markj added inline comments.
sys/netinet/ip_output.c
737

You might need to rate-limit this print. Or perhaps not: the first error will put the socket into an error state and prevent further transmits, so we would get one print per connection.

This revision is now accepted and ready to land.Fri, Jan 3, 2:43 PM