Page MenuHomeFreeBSD

TCP: Misc cleanups of tcp_subr.c
ClosedPublic

Authored by rrs on Apr 11 2023, 1:32 PM.
Tags
None
Referenced Files
Unknown Object (File)
Fri, Nov 8, 10:03 PM
Unknown Object (File)
Mon, Nov 4, 11:20 PM
Unknown Object (File)
Fri, Nov 1, 3:50 AM
Unknown Object (File)
Sun, Oct 27, 7:10 AM
Unknown Object (File)
Oct 2 2024, 10:16 PM
Unknown Object (File)
Sep 24 2024, 11:00 PM
Unknown Object (File)
Sep 23 2024, 4:54 PM
Unknown Object (File)
Sep 23 2024, 8:25 AM

Details

Summary

In going through all the TCP stacks I have found we have a few little bugs and niggles that need to
be cleaned up in tcp_subr.c including the following:

a) Set tcp_restoral_thresh to 450 (45%) not 550. This is a better proven value in my testing.
b) Lets track when we try to do pacing but fail via a counter for connections that do pace.
c) If a switch away from the default stack occurs and it fails we

need to make sure the time scale is in the right mode (just in case
the other stack changed it but then failed).

d) Use the TP_RXTCUR() macro when starting the TT_REXMT timer.
e) When we end a default flow lets log that in BBlogs as well as cleanup any t_acktime (disable).
f) When responding in V6 if a traffic class is set we should apply that tclass to the response.
g) When we respond with a RST lets make sure to update the log_end_status properly.
h) When starting a new pcb lets assure that all LRO features are off.
i) When discarding a connection lets make sure that any t_in_pkt's that might be there are freed properly.

Test Plan

These all can be tested with various switching back and forth between rack/bbr and the default
stack. Also we can arrange to discard a tcb with packets left un-processed to make sure they do
get cleaned up. We can also set a IPV6 traffic class and make sure when we respond that we
do include that class.

Diff Detail

Lint
Lint Skipped
Unit
Tests Skipped

Event Timeline

rrs requested review of this revision.Apr 11 2023, 1:32 PM
rrs added reviewers: glebius, gallatin.
rscheff added inline comments.
sys/netinet/tcp_subr.c
1709

this appears to be the inverse of IPV6_DCSP - suggest to maybe extract only the DSCP above (using the IPV6_DSCP macro, and storing this in a "dscp" var instead of a tclass var.

And then setting this like what ip_carp is doing:

ip6->ip6_flow = (ip6->ip6_flow & ~IPV6_FLOWINFO_MASK) | (inp->inp_flow & IPV6_FLOWLABEL_MASK);
:
if (inp->in6p_outputopts != NULL)

dscp = IPV6_DSCP(inp->in6p_outputopts->ip6po_tclass)

ip6->ip6_flow |= htonl(dscp << (IPV6_FLOWLABEL_LEN + IPTOS_DSCP_OFFSET));

sys/netinet/tcp_subr.c
1709

Isn't the traffic class set in ip6_output()?

sys/netinet/tcp_subr.c
1709

This particular change was actually done by Drew Gallatin IIRR. Basically at NF we use a specific label when
we send things into the network so that we can catch content incorrectly being routed through the internal
netflix connectivity network. The tcp_respond() function was found to not set the traffic class correctly.

And to Michael, well if ip6_output() set it, it was *not* setting it for V6 when we went through the respond function. I can
take this out and leave it in our base only but if anyone cares about this they will find its broken...

Not questioning that change per se. But what you are functionally doing is not extracting the full traffic class (tclass), but extract the DSCP bits, clearing the ECN bits and sending traffic later using the same DSCP codepoint back.

My suggested code would perform the same functionality, but makes it clear that all it really does is care about the DSCP codepoint (if one is set) and retainting the other ipv6 flow label intact.

(ECN bits should always be cleared out in the flow ID field, as they were).

sys/netinet/tcp_subr.c
1709

Let me go change it to that..I don't see why that won't work and like you say makes sure
we don't inadvertently twiddle the ECN bits.

As I poke in the code the ip6->ip6_flow = (....) line you have at the beginning
is already here (right above the ip6)vfc set.. or am I missing something?

Address Richard and Michaels comments by letting sleeping dogs lie :)

rrs marked 2 inline comments as done.Apr 12 2023, 5:42 PM
rrs added inline comments.
sys/netinet/tcp_subr.c
1709

Ok Richard your suggested fix is broken in a couple of ways:

a) IPV6_DSCP() only works on an ip6_flow pointer
b) It basically does a (( x >> 20) & 0xfc) to the int.
c) your suggested htonl( dscp << (IPV6_FLOWLABEL_LEN + IPTOS_DSCP_OFFSET) works out to

     htonl(dscp << (20 + 2))
which I believe will be broken since you did not slide it down 22 but only 20.

So I have decided I will let the FreeBSD version continue to be like this. We will keep our
diff's that make it work :)

sys/netinet/tcp_subr.c
1709

So, the IPV6_DSCP macro is extracting the tclass masked to the dscp... Sorry for not catching that.

Still, the following seems to be cleaner to me (and probably fewer ops):

+              int32_t tclass = 0;

		ip6 = (struct ip6_hdr *)ip_ptr;
-		ip6->ip6_flow = (ip6->ip6_flow & ~IPV6_FLOWINFO_MASK) |
-			(inp->inp_flow & IPV6_FLOWINFO_MASK);
+              ip6->ip6_flow = (ip6->ip6_flow & ~IPV6_FLOWINFO_MASK) | 
+                      (inp->inp_flow & IPV6_FLOWLABEL_MASK);
		ip6->ip6_vfc = (ip6->ip6_vfc & ~IPV6_VERSION_MASK) |
			(IPV6_VERSION & IPV6_VERSION_MASK);
		if (inp->in6p_outputopts != NULL)
-			tclass = inp->in6p_outputopts->ip6po_tclass;
+			tclass = (inp->in6p_outputopts->ip6po_tclass & ~IPTOS_ECN_MASK);
-		else
-			tclass = 0;
-		ip6->ip6_flow &= htonl((~0xfc) << 20);
-		ip6->ip6_flow |= htonl((tclass & 0xfc) << 20);
+             ip6->ip6_flow |= htonl(tclass << IPV6_FLOWLABEL_LEN);
sys/netinet/tcp_subr.c
1709

The code looks OK to me. It clears the flowlabel and the DSCP and then sets the DSCP.
It looks like code duplication, but that might be necessary. Do you know which problem this patch fixes? Then we can look at it and decide this is the perfect patch or improve it later.

Would be nice to know what the problem is you are experiencing with the DSCP for IPv6.

This revision is now accepted and ready to land.Apr 12 2023, 6:37 PM

Would be nice to know what the problem is you are experiencing with the DSCP for IPv6.

If I remember correctly it was *not* being set. I will ask Drew on Monday and see if he remembers... it is his fix ;)

This revision was automatically updated to reflect the committed changes.