inpcb: retire two-level port hash database

This structure originates from pre-FreeBSD times, when system RAM was
measured in single digits of MB and Internet speeds were measured in Kb.
At the first level the database hashes only the port value to calculate
an index into an array of pointers to lazily allocated headers, each of
which holds a list of inpcbs with the same local port.  This design was
apparently made to conserve kernel memory.

In the modern kernel the size of the first level of the hash is derived
from maxsockets, which is derived from maxfiles, which in turn is derived
from the amount of physical memory.  The size of the hash is then capped
by IPPORT_MAX, because it doesn't make any sense to have a hash table
larger than the set of possible values.  In practice this cap takes
effect even on my laptop.  I haven't done a precise calculation or
experiments, but my guess is that any system with > 8 Gb of RAM will be
autotuned to an IPPORT_MAX sized hash.

Apparently, this hash is a degenerate one: it never has more than one
entry in any slot.  You can check this with kgdb:

    set $i = 0
    while ($i <= tcbinfo->ipi_porthashmask)
        set $p = tcbinfo->ipi_porthashbase[$i].clh_first
        set $c = 0
        while ($p != 0)
            set $c = $c + 1
            set $p = $p->phd_hash.cle_next
        end
        if ($c > 1)
            printf "Slot %u count %u", $i, $c
        end
        set $i = $i + 1
    end

By retiring the two-level hash we remove a lot of complexity at the cost
of only one comparison, 'inp->inp_lport != lport', in the lookup loop,
which is always going to be false on most machines anyway.  This
comparison should definitely be cheaper than an extra pointer traversal.
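
The single-level lookup can be illustrated with a minimal userland
sketch.  This is not the in_pcb.c code: struct pcb, pcb_insert(),
pcb_lookup() and the tiny HASH_SIZE are hypothetical names and values,
and the table is kept small on purpose so that different ports collide
and the port comparison actually matters.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, deliberately tiny table so that ports share slots. */
#define	HASH_SIZE	8u

struct pcb {
	uint16_t	lport;	/* local port */
	struct pcb	*next;	/* hash chain linkage */
};

static struct pcb *buckets[HASH_SIZE];

static unsigned
port_hash(uint16_t lport)
{
	/* HASH_SIZE is a power of two, so masking selects a slot. */
	return (lport & (HASH_SIZE - 1));
}

static void
pcb_insert(struct pcb *p)
{
	unsigned slot = port_hash(p->lport);

	/* Link directly into the slot; no per-port header is allocated. */
	p->next = buckets[slot];
	buckets[slot] = p;
}

static struct pcb *
pcb_lookup(uint16_t lport)
{
	struct pcb *p;

	for (p = buckets[port_hash(lport)]; p != NULL; p = p->next) {
		/* The one extra comparison paid for the flat layout. */
		if (p->lport != lport)
			continue;
		return (p);
	}
	return (NULL);
}
```

When the table has at least as many slots as there are possible ports,
every chain holds a single port and the comparison never skips an entry.
Insertion in this sketch only links the entry into its slot and
allocates nothing.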

Another positive change worth singling out is that we no longer need to
allocate memory in a non-sleepable context in in_pcbinshash(), so a
potential ENOMEM on connect(2) is removed.

Reviewed by: markj
Differential Revision: https://reviews.freebsd.org/D49151