The hardware is allowed to reorder these loads in wg_deliver_out() and
wg_deliver_in(). Without load-side synchronization to pair with the
store barriers in wg_encrypt() and wg_decrypt(), we can end up with a
garbage mbuf that we then try to pass on. The issue is particularly
prevalent with the weaker memory models of !x86 platforms.
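
For illustration only, the pairing described above can be sketched in
portable C11. This is not the driver code: struct pkt, pkt_publish() and
pkt_consume() are hypothetical stand-ins, and the int payload stands in
for the mbuf pointer. In the kernel, the analogous primitives would
presumably be the acquire/release variants from atomic(9).

```c
#include <stdatomic.h>

/* Hypothetical stand-in for a queued packet; not the real wg(4) layout. */
enum pkt_state { PKT_UNCRYPTED = 0, PKT_CRYPTED = 1 };

struct pkt {
	int		payload;	/* stands in for the mbuf pointer */
	_Atomic int	state;
};

/* Producer side, analogous to the encrypt/decrypt workers. */
static void
pkt_publish(struct pkt *p, int payload)
{
	p->payload = payload;
	/* Release: the payload store is ordered before the state store. */
	atomic_store_explicit(&p->state, PKT_CRYPTED, memory_order_release);
}

/* Consumer side, analogous to the deliver paths. Returns 1 if ready. */
static int
pkt_consume(struct pkt *p, int *out)
{
	/*
	 * Acquire: pairs with the release store above, so the payload
	 * load below cannot be reordered ahead of the state check.
	 */
	if (atomic_load_explicit(&p->state, memory_order_acquire) !=
	    PKT_CRYPTED)
		return (0);
	*out = p->payload;
	return (1);
}
```

With a plain (non-acquire) load of the state, a weakly ordered CPU may
satisfy the payload load first and hand back stale contents even though
the state check passes.
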
With the patch, my dual-iperf3 reproducer is dramatically more stable
than it is without.
PR: 264115