When we're done processing a batch of pfsync_msg we can immediately
proceed to pfsyncintr(), rather than schedule a swi (possibly several
times) during the processing. This reduces the swi scheduling overhead,
and also reduces contention on PFSYNC_LOCK.
Replace pfsync_push() calls in the pfsync_msg_intr() path with simply
setting the PFSYNCF_PUSH flag. pfsyncintr() is always called at the end
of that path, so there's no need to schedule the swi.
In pfsync_q_ins() we don't need to schedule the swi either, because it's
always called through pfsync_msg_intr(), so pfsyncintr() is guaranteed to
run.
In all other cases (like when we request an update in response to a
received update) we still schedule the swi. Those flows are unmodified.
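As a rough illustration only, the difference between the two flows can
be modelled in plain userland C. Everything below (struct sim_softc,
SIM_F_PUSH, the sim_*() functions) is a hypothetical stand-in and not
the actual pfsync code; it only counts how often a swi would be
scheduled for a batch of messages under each scheme.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for PFSYNCF_PUSH; not the real flag value. */
#define SIM_F_PUSH 0x01

/* Hypothetical stand-in for the per-interface state. */
struct sim_softc {
	int flags;         /* pending-work flags */
	int swi_scheduled; /* number of times a swi was scheduled */
	int intr_runs;     /* number of times the handler did work */
};

/* Previous scheme: every push request schedules the swi. */
static void
sim_push_schedule(struct sim_softc *sc)
{
	sc->flags |= SIM_F_PUSH;
	sc->swi_scheduled++;
}

/* New scheme in the receive path: only record that work is pending. */
static void
sim_push_defer(struct sim_softc *sc)
{
	sc->flags |= SIM_F_PUSH;
}

/* Stand-in for the interrupt handler: flush if a push is pending. */
static void
sim_intr(struct sim_softc *sc)
{
	if (sc->flags & SIM_F_PUSH) {
		sc->flags &= ~SIM_F_PUSH;
		sc->intr_runs++;
	}
}

/*
 * Process a batch of n messages, each of which requests a push.
 * With defer set, the handler is called directly once at the end
 * of the batch and no swi is scheduled.
 */
static void
sim_process_batch(struct sim_softc *sc, size_t n, bool defer)
{
	for (size_t i = 0; i < n; i++) {
		if (defer)
			sim_push_defer(sc);
		else
			sim_push_schedule(sc);
	}
	if (defer)
		sim_intr(sc);
}
```

In this model a batch of eight messages schedules the swi eight times
under the previous scheme, and zero times (with a single direct handler
call) under the new one. The model ignores locking and the asynchronous
execution of the swi.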
Tests show a ~25% improvement in throughput.
Sponsored by: Orange Business Services