Page MenuHomeFreeBSD

unix/dgram: smart socket buffers for one-to-many sockets
ClosedPublic

Authored by glebius on May 23 2022, 9:02 PM.
Tags
None
Referenced Files
Unknown Object (File)
Fri, Jan 17, 1:01 AM
Unknown Object (File)
Thu, Jan 16, 10:23 PM
Unknown Object (File)
Nov 23 2024, 8:36 PM
Unknown Object (File)
Nov 23 2024, 2:13 PM
Unknown Object (File)
Nov 23 2024, 9:55 AM
Unknown Object (File)
Nov 8 2024, 12:46 PM
Unknown Object (File)
Nov 8 2024, 3:09 AM
Unknown Object (File)
Nov 5 2024, 9:28 PM

Details

Summary

A one-to-many unix/dgram socket is a socket that has been bound
with bind(2) and can get multiple connections. A typical example
is /var/run/log bound by syslogd(8) and receiving multiple
connections from libc syslog(3) API. Until now all of these
connections shared the same receive socket buffer of the bound
socket. This made the socket vulnerable to overflow attack.
See 240d5a9b1ce for a historical attempt to workaround the problem.

This commit creates a per-connection socket buffer for every single
connected socket and eliminates the problem. The new behavior will
optimize seldom writers over frequent writers. See added test case
scenarios and code comments for more detailed description of the
new behavior.

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Passed
Unit
No Test Coverage
Build Status
Buildable 45890
Build 42778: arc lint + arc unit

Event Timeline

This patch is the final patch in the stack that starts at D35293 and this one achieves the final goal. The intermediate patches pass all tests, but should be considered as intermediate patches during review. The later changes to libc in D35305 depend on kernel functionality achieved by this patch. The whole branch can be viewed at https://github.com/glebius/FreeBSD/commits/unix-dgram

Other important note. This work seems to be dedicated to unix/dgram only, but it actually opens opportunities to other protocols. After this change, the only generic datagram socket left would be UDP, which means that UDP can make sosend_dgram()/soreceive_dgram() its private methods and optimize them. After that step the classic sockbuf would support only stream data, and then sosend_generic()/soreceive_generic() can be optimized to support only stream data. The SCTP which already does its private implementation of socket buffers as a hack can also use now generic feature.

Maintain overall sb_acc + sb_ccc on the receive socket buffer so that
generic socket event code works.

This could be a temporary thing, and eventually this code shall be
protocol specific, too. But let's make the very first protocol specific
socket buffer implementation less disruptive.

I suspect the feature should be documented in unix.4 somehow. Users writing application software regularly get confused by PF_UNIX socket buffer limits, etc., so perhaps we should add a new section which tries to describe the precise behaviour of each type of socket with respect to socket buffer limits and sizing.

sys/kern/uipc_usrreq.c
537

The assignment sendspace = unpdg_recvspace looks incorrect, so I'd suggest adding a comment explaining why it's correct.

1148

Add some assertions to check for underflow?

1342

The term "temporary connected" isn't actually used anywhere.

Consider adding a boolean flag variable instead of using addr == NULL to test for a connected sender.

1345
1348
1349
1355

I think some note on socket buffer locking is useful here. In particular, a connected sender's send buffer is synchronized by the receiver's receive buffer mutex, not by the sender's send buffer mutex.

1373

There are several places where uxdg_{cc,ctl,mbcnt} are adjusted by some deltas, IMHO it'd be cleaner to introduce a subroutine for that which can check for overflow etc.

2203
2205
glebius marked 5 inline comments as done.
  • Comment adds and fixes.
  • Couple assertions.
  • update unix(4)
sys/kern/uipc_usrreq.c
1342

The term isn't used, but it is 100% truth :) Through the lifetime of the actual sending part of this function both sockets are connected, and if you take a snapshot of memory, you won't be able to tell truly connected sockets with a sockets that are now in uipc_sosend_dgram(). This is also the reason why there is no bool connected = (addr != NULL), because through the function course they always are connect. I used to have this bool in earlier version, and then wiped it out.

I will try to improve wording in this comment.

1355

There is such comments in the structure definition in sockbuf.h

1373

There are several places where uxdg_{cc,ctl,mbcnt} are adjusted by some deltas, IMHO it'd be cleaner to introduce a subroutine for that which can check for overflow etc.

Well, generic socket code doesn't have this.

sys/sys/sockbuf.h
142

Typo

Fix leak in unp_disconnect(). Rewrite the added code to be more clear.

share/man/man4/unix.4
399
411

"... sockets of type SOCK_DGRAM are unreliable ..."

414
418

I think some further explanation of socket buffer accounting would be useful. My understanding of the implementation is,

  1. for unconnected sends, the receiving socket's receive buffer limit applies,
  2. for connected sends, the sending socket's send buffer limit applies.

The aim is to provide some initial guidance for a programmer confused by ENOBUFS from a send() call.

I'd be ok adding this as a separate commit.

421

Or "An apparent," but I think that word does not add anything here.

422
423
sys/kern/uipc_usrreq.c
1342

Fair enough, thank you.

This revision is now accepted and ready to land.Jun 9 2022, 6:54 PM
  • More manual page text.
  • Small bugfix.
This revision now requires review to proceed.Jun 9 2022, 7:59 PM
pauamma_gundo.com added inline comments.
share/man/man4/unix.4
397
419
437

Do you mean "will" here? "Would", to me, has undertones of "if something else didn't prevent that".

446

"apparent"? Did you mean "possible"? (See also markj's comment above.)

This revision now requires changes to proceed.Jun 9 2022, 11:06 PM

Accept all manual page suggections. Thanks!

  • Remove more legacy from uipc_dgram_sbspace() making it a true bool. Fixes edge case found by syzcaller.

Manual page English LGTM. Can't say about code or consistency between them.

This revision is now accepted and ready to land.Jun 14 2022, 1:08 AM