vfs: drop one vnode list lock trip during vnlru free recycle
vnlru_free_impl would take the lock prior to returning even though the most
frequent caller does not need it.
Unsurprisingly, the vnode_list mtx is the primary bottleneck when recycling,
and avoiding the useless lock trip helps.
Setting maxvnodes to 400000 and running 20 parallel finds each with a
dedicated directory tree of 1 million vnodes in total:
before: 4.50s user 1225.71s system 1979% cpu 1:02.14 total
after: 4.20s user 806.23s system 1973% cpu 41.059 total
That's a 34% reduction in total real time.
With this the lock *remains* the primary bottleneck when running on
ZFS.
Approved by: re (gjb)
(cherry picked from commit 74be676d87745eb727642f6f8329236c848929d5)
(cherry picked from commit 206dd9d1a82df140a6071545a2dc558e8d9f5ad0)