vrele(): Never block trying to acquire an exclusive vnode lock
Needs ReviewPublic
Actions

Authored by olce on Jul 3 2023, 12:53 PM.

Details

Reviewers

jah
mjg
kib
markj

Summary

Doing so can cause a LOR in code releasing the last active reference on an
unlocked vnode while holding a lock to one of its children.

Diff Detail

Repository

rG FreeBSD src repository

Lint

Lint Skipped

Unit

Tests Skipped

Build Status

Buildable 52399
Build 49290: arc lint + arc unit

Event Timeline

olce created this revision.Jul 3 2023, 12:53 PM

Herald added a subscriber: imp. · View Herald TranscriptJul 3 2023, 12:53 PM

olce requested review of this revision.Jul 3 2023, 12:53 PM

Harbormaster completed remote builds in B52399: Diff 124119.Jul 3 2023, 12:53 PM

olce added a parent revision: D40848: VFS lookup: compute_lk_cnflags(): Remove unused argument 'cnflags'; Rename.Jul 3 2023, 1:03 PM

olce added a child revision: D40850: VFS lookup: New vn_cross_single_mount() and vn_cross_mounts().

Could you please describe the LOR in more detail, if only for my benefit? In particular, in what situation do we try to release a use ref while holding a child locked, and where do we try to lock a vnode while holding a child vnode locked?

sys/kern/vfs_subr.c
3296	Per style(9) the `/` and `/` delimiters should be on their own lines.

vrele() can lock the vnode, this is known and expected behavior. I do not see a reason to try to change it, esp. because it makes the deferred inactivation more frequent.

That said, for patch to be closer to complete, there is at least one more place where lock is taken in sleepable manner.

First a general note that there is one major deficiency in lock assertions: there is no mechanism right now to let funcs denote they optionally take a lock, so many potential problems require actually running into them. For example all vput_final entry points should assert they can take the lock (if not already held) regardless of refcount state. Should there be spots which guarantee count > 1, a dedicated routine can be added to be called instead. The assert thing in general, not just for vnodes, is a problem which I'm going to sort out in the foreseeable future as it is useful at Netgate.

That said, the expected vput/vrele/whatever call order retains the global locking order of parent -> child, making the patch harmful for the normal case. If there is a case which does not, it is most likely a bug. If there is a case which does not and it is not fixable, a dedicated variant can be added which only trylocks like in the proposal above.

Sorry for the noise, I should not have uploaded this differential. I initially thought there were possible LORs in the later introduced vn_cross_mounts() and co. because the code is structured so as to obtain the locked root vnode while still holding an active reference on the covered vnode. *But*, since the point of the new protocol is to ensure that the covered and root vnodes (both sides of the mount) are never locked together, no LOR is actually possible.

But while we are here, I have a few questions if you don't mind.

In D40849#930256, @kib wrote:

vrele() can lock the vnode, this is known and expected behavior. I do not see a reason to try to change it, esp. because it makes the deferred inactivation more frequent.

That said, for patch to be closer to complete, there is at least one more place where lock is taken in sleepable manner.

Indeed, I had forgotten want_unlock and the loop around vinactive() below due to the ERELOOKUP case (I've not yet fully analyzed the reason for it).

That said, why what is good for vput() and even vunref() wouldn't be good for vrele()? And why, in the VUNREF case, ERELOOKUP is simply ignored instead of the vnode being put on the lazy list for the inactivation to be retried?

Thanks.

olce removed a parent revision: D40848: VFS lookup: compute_lk_cnflags(): Remove unused argument 'cnflags'; Rename.Jul 7 2023, 6:55 AM

olce removed a child revision: D40850: VFS lookup: New vn_cross_single_mount() and vn_cross_mounts().