Certain socket-based workloads may require significantly larger socket
buffers than the rest of the sockets in the system. To accommodate these
workloads without bumping the system-wide limit, which would weaken
system protection, allow an easy override. A special socket option,
SO_RCVBUFFORCE, which requires the newly added PRIV_NETINET_RCVBUFFORCE
privilege, sets the socket receive buffer size, ignoring the sb_max
(kern.ipc.maxsockbuf) limit.
This option has been available on Linux since 2.6.14.
The primary use case (for us) is netlink sockets, which can receive large dumps from the kernel. A current full-view IPv4 dump for a single fib is about 40 megabytes, and it can grow notably when wide multipath is configured. Large-scale customers set the buffer to ~128M.
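For illustration, a minimal sketch of how an application might use the option. The helper name set_rcvbuf() is hypothetical, and the fallback to plain SO_RCVBUF on EPERM or ENOPROTOOPT is an assumption about how a caller may want to behave, not something this change mandates:

```c
#include <errno.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: request a receive buffer of `bytes`, trying the
 * privileged override first. Returns 0 on success, -1 with errno set.
 */
static int
set_rcvbuf(int fd, int bytes)
{
#ifdef SO_RCVBUFFORCE
	if (setsockopt(fd, SOL_SOCKET, SO_RCVBUFFORCE, &bytes,
	    sizeof(bytes)) == 0)
		return (0);
	/*
	 * Assumed failure modes: EPERM when the caller lacks the
	 * privilege, ENOPROTOOPT when the kernel does not know the option.
	 * Anything else is reported to the caller as is.
	 */
	if (errno != EPERM && errno != ENOPROTOOPT)
		return (-1);
#endif
	/* Unprivileged path: still subject to the system-wide limit. */
	return (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &bytes,
	    sizeof(bytes)));
}
```

A netlink consumer expecting large dumps would call something like set_rcvbuf(fd, 128 * 1024 * 1024) right after creating the socket.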
Questions:
- Should there be an absolute maximum on the buffer size (e.g. another sysctl set to, say, 1 gigabyte, or a number correlated with the total amount of mbufs)?