Various subsystems preallocate a set of pbufs, which are used to
implement I/O operations. Unlike most buf allocations, pbuf allocations
are transient.

Most subsystems preallocate nswbuf or nswbuf/2 pbufs each. The
preallocation ensures that pbuf allocation will succeed in low-memory
conditions, which might help avoid deadlocks. Currently we initialize
nswbuf = min(nbuf / 4, 256).

nbuf/4 > 256 on anything but the smallest systems. For example,
nswbuf is 256 in a VM with 128MB of memory. In this configuration, a
Firecracker VM with one CPU preallocates over 900 pbufs. This consumes
2MB of RAM and adds several milliseconds to the kernel's (very short)
boot time.

Scale nswbuf by ncpu in the common case. I think this makes more sense
than scaling by the amount of RAM, since pbuf allocations are transient
and aren't used for caching. With the change, we get nswbuf=256 with 8
CPUs. For larger systems, the larger size may help, e.g., async
sendfile consumes pbufs from the vnode_pbuf_zone and a large limit is
desirable for a CDN workload. On smaller systems this reduces the
amount of wasted memory and marginally improves boot times, though the
latter is really mostly relevant for micro-VMs.
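
As a rough illustration of the difference, the two defaults can be
sketched as below. This is not the committed code: the function names
are hypothetical, the per-CPU factor of 32 is only inferred from
"nswbuf=256 with 8 CPUs", and keeping nbuf / 4 as an upper bound is an
assumption.

```c
/*
 * Sketch only.  nswbuf_old() and nswbuf_new() are hypothetical helpers,
 * not kernel functions.
 */
#define	NSWBUF_OLD_CAP	256	/* cap in the current formula */
#define	NSWBUF_PER_CPU	32	/* assumed: 256 pbufs / 8 CPUs */

int
nswbuf_min(int a, int b)
{
	return (a < b ? a : b);
}

/* Current default: min(nbuf / 4, 256). */
int
nswbuf_old(int nbuf)
{
	return (nswbuf_min(nbuf / 4, NSWBUF_OLD_CAP));
}

/*
 * Proposed default (assumed form): scale with the CPU count, still
 * bounded by nbuf / 4.
 */
int
nswbuf_new(int nbuf, int ncpu)
{
	return (nswbuf_min(nbuf / 4, NSWBUF_PER_CPU * ncpu));
}
```

Under these assumptions, a single-CPU system whose nbuf / 4 exceeds 256
would go from nswbuf = 256 to nswbuf = 32, while an 8-CPU system stays
at 256 and larger systems get proportionally more.
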

Reported by: cperciva