
unionfs_lookup(): fix wild accesses to vnode private data
Closed, Public

Authored by jah on Apr 2 2024, 10:44 PM.

Details

Summary

There are a few spots in which unionfs_lookup() accesses unionfs vnode
private data without holding the corresponding vnode lock or interlock.
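
For illustration only, the kind of locked access pattern involved can be sketched as below. This is not the actual diff; the helper function name is invented, but VI_LOCK()/VI_UNLOCK(), VN_IS_DOOMED() and VTOUNIONFS() are the real interfaces in play.

    /*
     * Illustrative sketch only (not the actual change): a hypothetical
     * helper showing private data being read under the vnode interlock.
     */
    #include <sys/param.h>
    #include <sys/vnode.h>

    #include <fs/unionfs/union.h>

    static struct vnode *
    unionfs_upper_vnode_sketch(struct vnode *vp)
    {
            struct unionfs_node *unp;
            struct vnode *uvp;

            /*
             * Take the vnode interlock before touching v_data: a
             * concurrent forced unmount may doom the vnode and reclaim
             * its private data, so an unlocked dereference is a
             * use-after-free race.
             */
            VI_LOCK(vp);
            if (VN_IS_DOOMED(vp)) {
                    VI_UNLOCK(vp);
                    return (NULL);
            }
            unp = VTOUNIONFS(vp);
            uvp = unp->un_uppervp;
            VI_UNLOCK(vp);
            return (uvp);
    }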

Diff Detail

Repository
rG FreeBSD src repository

Event Timeline

jah requested review of this revision. Apr 2 2024, 10:44 PM
This revision is now accepted and ready to land. Apr 3 2024, 1:11 AM

These changes indeed plug holes.

Side note: It's likely that I'll rewrite the whole lookup code at some point, so I'll have to test again for races. The problem with this kind of bug is that it is only triggered by rare races. We already have stress2, which is great, but it also relies on "chance". This makes me think that perhaps we could have a more systematic framework that triggers vnode dooming at, say, unlock time. I'll probably explore that at some point.
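
As a purely hypothetical illustration of that idea (no such facility exists today; the sysctl and function names are invented), such a framework might hook the unlock path roughly like this:

    /*
     * Hypothetical fault-injection sketch.  vgone(), VN_IS_DOOMED() and
     * arc4random() are real kernel interfaces; everything else here is
     * made up for illustration.
     */
    #include <sys/param.h>
    #include <sys/systm.h>
    #include <sys/kernel.h>
    #include <sys/sysctl.h>
    #include <sys/vnode.h>

    static int vfs_doom_on_unlock = 0;
    SYSCTL_INT(_debug, OID_AUTO, doom_on_unlock, CTLFLAG_RW,
        &vfs_doom_on_unlock, 0,
        "Randomly doom vnodes just before unlock to exercise races");

    /*
     * Would be called with the vnode still exclusively locked,
     * immediately before the unlock, so that any consumer that
     * re-dereferences the vnode after dropping its lock is exercised
     * against a doomed vnode.
     */
    static void
    vfs_maybe_doom_sketch(struct vnode *vp)
    {
            if (vfs_doom_on_unlock != 0 && !VN_IS_DOOMED(vp) &&
                (arc4random() % 100) == 0)
                    vgone(vp);
    }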

I've wished for that multiple times as well; something like a "VFSan", although I do wonder whether some of this functionality could be integrated into an existing sanitizer like KMSAN.
In a perfect (imaginary) world, the ideal thing would be to simplify the VFS so that filesystems don't have to do so much extra work to handle the possibility of a concurrent forced unmount in the first place, but I'm not sure how that would be done. It brings to mind something like SMR or epoch(9), in which vnode dispatch would enter something like a "VFS epoch" and forcible unmount would wait for that epoch to drain. The mechanism would have to be nestable, sleepable, and still allow forced unmount to complete in a reasonable amount of time, which makes it sound roughly like magic.
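
Purely as a thought experiment, that imagined mechanism might be sketched with the epoch(9) API as below; none of these VFS hooks exist, and the hard parts just mentioned (nestable, sleepable sections and bounded unmount latency) are exactly what the sketch glosses over.

    /*
     * Thought-experiment sketch of a "VFS epoch" built on epoch(9).
     * Only epoch_alloc(), epoch_enter_preempt(), epoch_exit_preempt()
     * and epoch_wait_preempt() are real; vfs_epoch and the functions
     * below are invented for illustration.
     */
    #include <sys/param.h>
    #include <sys/epoch.h>

    static epoch_t vfs_epoch;

    static void
    vfs_epoch_init_sketch(void)
    {
            /*
             * EPOCH_PREEMPT sections may be preempted, though unbounded
             * sleeping remains one of the open problems noted above.
             */
            vfs_epoch = epoch_alloc("vfs", EPOCH_PREEMPT);
    }

    /* Vnode dispatch would bracket each VOP with the epoch... */
    static void
    vfs_dispatch_enter_sketch(struct epoch_tracker *et)
    {
            epoch_enter_preempt(vfs_epoch, et);
    }

    static void
    vfs_dispatch_exit_sketch(struct epoch_tracker *et)
    {
            epoch_exit_preempt(vfs_epoch, et);
    }

    /* ...and forced unmount would wait for in-flight VOPs to drain. */
    static void
    vfs_forced_unmount_drain_sketch(void)
    {
            epoch_wait_preempt(vfs_epoch);
    }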