vfs: drop one vnode list lock trip during vnlru free recycle
vnlru_free_impl would take the lock prior to returning even though its
most frequent caller does not need it.
Unsurprisingly vnode_list mtx is the primary bottleneck when recycling
and avoiding the useless lock trip helps.
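For illustration only, a minimal userspace sketch of the pattern, assuming
the helper is entered with the list lock held and drops it to do its work.
Every name below is hypothetical and pthread mutexes stand in for the
kernel mtx; this is not the kernel code.

```c
#include <pthread.h>

/* Hypothetical stand-ins; none of these names come from the kernel. */
static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;
static unsigned long lock_trips;	/* acquisitions of list_lock */

static void
list_lock_acquire(void)
{
	pthread_mutex_lock(&list_lock);
	lock_trips++;		/* updated with list_lock held */
}

static void
list_lock_release(void)
{
	pthread_mutex_unlock(&list_lock);
}

/* Old shape: entered locked, retakes the lock before returning. */
static void
free_impl_relock(void)
{
	list_lock_release();
	/* ... recycle work done without the list lock ... */
	list_lock_acquire();
}

/* New shape: entered locked, returns with the lock dropped. */
static void
free_impl_norelock(void)
{
	list_lock_release();
	/* ... recycle work done without the list lock ... */
}

/* A caller that has no use for the lock once the helper is done. */
static unsigned long
caller_old(void)
{
	unsigned long before = lock_trips;

	list_lock_acquire();
	free_impl_relock();
	list_lock_release();	/* lock was retaken only to be dropped */
	return (lock_trips - before);
}

static unsigned long
caller_new(void)
{
	unsigned long before = lock_trips;

	list_lock_acquire();
	free_impl_norelock();	/* nothing left to unlock */
	return (lock_trips - before);
}
```

In this sketch caller_old() takes the lock twice per call and caller_new()
takes it once. A caller which does need the lock afterwards would retake
it explicitly.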
Setting maxvnodes to 400000 and running 20 parallel finds each with a
dedicated directory tree of 1 million vnodes in total:
before: 4.50s user 1225.71s system 1979% cpu 1:02.14 total
after: 4.20s user 806.23s system 1973% cpu 41.059 total
That's a 34% reduction in total real time.
With this the lock *remains* the primary bottleneck when running on
ZFS.