Page MenuHomeFreeBSD

sysctl interface to set socket options for TCP endpoints
ClosedPublic

Authored by tuexen on Feb 2 2022, 10:48 AM.
Tags
None
Referenced Files
Unknown Object (File)
Sat, Nov 9, 8:02 PM
Unknown Object (File)
Mon, Nov 4, 11:32 PM
Unknown Object (File)
Thu, Oct 17, 12:20 AM
Unknown Object (File)
Oct 9 2024, 4:21 AM
Unknown Object (File)
Oct 7 2024, 10:28 PM
Unknown Object (File)
Oct 6 2024, 12:48 AM
Unknown Object (File)
Sep 26 2024, 4:46 AM
Unknown Object (File)
Sep 25 2024, 6:45 PM
Subscribers

Details

Summary

This sysctl interface to set a socket option will be used by an upcoming tool tcpsso, which will allow to set a socket option via the command line.

This depends on some upcoming changes to tcp_ctloutput_set(), which allows it to be sent from a context different from setsockopt(). It is under review in D34164.

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

There may be an optimization - if that sysctl is being called multiple times, intially with a buflen 0, then with a suitable allocated buffer, you may want to simply return the buffer size and return early. (although as this sysctl expects a binary blob, sysctl will not be a tool suitable to interface with this).

I see that you can set socket options wildcard style across all currently existing sockets.

Also, I'm not sure when the locking comes into play with these macros - is the INP only locked during that final iteration, or from the initialization of inpi (if at all)?

The fact that you needed to include in_pcb_var.h really hints that the new function belongs to in_pcb.c rather than tcp_subr.c! The only TCP specific part is that you call tcp_ctlooutput_set(). What if we create a new field in protocol methods - pr_ctloutput_set(), that would accept already locked inp. You don't need to implement it for UDP, just leave possibility to easy add this later.

This depends on some upcoming changes to tcp_ctloutput_set(), which allows it to be sent from a context different from setsockopt().

Do you have this already published?

This depends on some upcoming changes to tcp_ctloutput_set(), which allows it to be sent from a context different from setsockopt().

Do you have this already published?

Not yet. It will come in two steps: the first one is a cleanup with no functional change (changing function arguments as we discussed and moving locking out of the set/get functions). Will put that up later today and add you as a reviewer. The second step will deal with the case that the set function might get called not via the socket layer. Will put that up once the first is in tree.

The fact that you needed to include in_pcb_var.h really hints that the new function belongs to in_pcb.c rather than tcp_subr.c! The only TCP specific part is that you call tcp_ctlooutput_set(). What if we create a new field in protocol methods - pr_ctloutput_set(), that would accept already locked inp. You don't need to implement it for UDP, just leave possibility to easy add this later.

Sure, that makes sense. However, the code also refers to V_tcbinfo and adds the sysctl field under net.inet.tcp. Are you suggesting to add the field under net.inet?

Ensure the socket does not go away if the inp gets unlocked.

Hold a refcount on the socket, if possible.

There may be an optimization - if that sysctl is being called multiple times, intially with a buflen 0, then with a suitable allocated buffer, you may want to simply return the buffer size and return early. (although as this sysctl expects a binary blob, sysctl will not be a tool suitable to interface with this).

You call this sysctl and provide the data needed to apply the socket option. This is a single call for a single end point.

I see that you can set socket options wildcard style across all currently existing sockets.

tcpsso can perform such a loop, but not the code in the kernel.

Also, I'm not sure when the locking comes into play with these macros - is the INP only locked during that final iteration, or from the initialization of inpi (if at all)?

The inp is locked in the body of the while loop.

I see that you can set socket options wildcard style across all currently existing sockets.

tcpsso can perform such a loop, but not the code in the kernel.

The gencnt / inpi combination unique on a system?

I see that you can set socket options wildcard style across all currently existing sockets.

tcpsso can perform such a loop, but not the code in the kernel.

The gencnt / inpi combination unique on a system?

When you allocate a PCB, it gets a unique (with the protocol) inp_gencnt. See in_pcb.c:660. You can use sockstat -i to get this identifier.

Hold a refcount on the socket, if possible.

Can you please explain why this became necessary?

Hold a refcount on the socket, if possible.

Can you please explain why this became necessary?

Normally tcp_ctloutput_set() is ccalled via the socket layer and therefore the socket can't go away. When calling it not via the socket layer, I want to have the same property. Please note, that the inp wlock is given up sometime during processing, for example when processing a socket option of a level different from IPPROTO_TCP. Are you saying, that the socket can't go away?

This revision is now accepted and ready to land.Feb 7 2022, 7:17 PM

Update for latest change in tcp_ctloutput_set().

This revision now requires review to proceed.Feb 8 2022, 10:14 PM

I talked to rrs@ yesterday and we think that the code for sysctl_setsockopt()
is not useful for SCTP and we don't see a use case for UDP. So from that
perspective, it can stay in tcp_subr.c.

My rationale is not taking care of SCTP or UDP, but keeping generic PCB code isolated from protocol specific code. Just like back in the past I asked for TLS being a SOL_SOCKET level option, not because I really cared about TLS over local socket.

Indeed creating a pr_ctloutput_set that would be implemented just for TCP is an overkill. But we should leave space for that. If we move the function from tcp_subr.c to in_pcb.c and don't expose in_pcb_var.h to TCP, then we need to expose a few bits of TCP to generic PCB code, namely:

extern struct inpcbinfo V_tcbinfo;
extern int tcp_ctloutput_set();

IMHO, this would be lesser evil than exposing generic bits to protocol specific code. And in future this can be easily translated into pr_ctloutput_set for massive setsockopts on other protocols.

Actually, tcpbinfo is already seen by in_pcb.c via tcp_var.h, so only tcp_ctloutput_set() needs to be exposed.

Gleb, does this version make you happy? Or do you have something else in mind? I find passing around the pcbinfo pointer and the function pointer no so elegant, but I have not found a better way. The only option I considered is to use IPPROTO_TCP as arg2 in the sysctl parameters and deduce from that the pcbinfo and function pointer. But then I have a problem with the INP_ALL_ITERATOR macro.

This is even better than what I was trying to describe. A middle ground between making a new protocol method and a hack with exporting single TCP function to inpcb code. Thanks a lot! :)

This revision is now accepted and ready to land.Feb 9 2022, 2:23 AM