fusefs: fix VOP_READDIR problems for NFS-exported FUSE file systems
ClosedPublic

Authored by asomers on Jan 3 2022, 12:41 AM.

Details

Summary

Fix NFS exports of FUSE file systems for big directories

The FUSE protocol does not require that a directory entry's d_off field
remain valid beyond the lifetime of its directory's file handle. Since the NFS
server must reopen the directory on every VOP_READDIR call, that means
it can't pass uio->uio_offset down to the FUSE server. Instead, it must
read the directory from 0 each time. It may need to issue multiple
FUSE_READDIR operations until it finds the d_off field that it's looking
for. That was the intention behind SVN r348209 and r297887, but a logic
bug prevented subsequent FUSE_READDIR operations from ever being issued,
rendering large directories incompletely browseable.
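
As an illustration only, the intended scan-from-zero behavior can be sketched in userland C. Everything here is hypothetical: struct sketch_dirent, mock_readdir (a stand-in for one FUSE_READDIR reply within a single open handle), sample_dir, and next_after_cookie are invented for this sketch and are not the fusefs code. The point it shows is that the loop must keep issuing requests until the wanted d_off is seen, rather than stopping after the first reply.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical directory entry; d_off is the cookie for resuming after it. */
struct sketch_dirent {
	uint64_t d_off;
	const char *name;
};

/* Hypothetical five-entry directory used by the sketch. */
static const struct sketch_dirent sample_dir[] = {
	{ 10, "a" }, { 20, "b" }, { 30, "c" }, { 40, "d" }, { 50, "e" },
};

/*
 * Stand-in for one FUSE_READDIR reply: return up to `batch` entries that
 * follow offset `off` (0 means "from the start").
 */
static size_t
mock_readdir(const struct sketch_dirent *dir, size_t n, uint64_t off,
    size_t batch, const struct sketch_dirent **out)
{
	size_t start = 0;

	assert(batch > 0);
	if (off != 0) {
		start = n;	/* unknown cookie: behave like end of directory */
		for (size_t i = 0; i < n; i++) {
			if (dir[i].d_off == off) {
				start = i + 1;
				break;
			}
		}
	}
	*out = dir + start;
	return (n - start < batch ? n - start : batch);
}

/*
 * Read from offset 0 and keep issuing requests until the entry whose d_off
 * equals `target` has been seen; return the name of the entry after it, or
 * NULL at end of directory. *nreq counts the requests issued.
 */
static const char *
next_after_cookie(const struct sketch_dirent *dir, size_t n, uint64_t target,
    size_t batch, unsigned *nreq)
{
	uint64_t off = 0;
	int found = (target == 0);

	*nreq = 0;
	for (;;) {
		const struct sketch_dirent *buf;
		size_t got = mock_readdir(dir, n, off, batch, &buf);

		(*nreq)++;
		if (got == 0)
			return (NULL);
		for (size_t i = 0; i < got; i++) {
			if (found)
				return (buf[i].name);
			if (buf[i].d_off == target)
				found = 1;
		}
		/* Continue within the same open handle. */
		off = buf[got - 1].d_off;
	}
}
```

With a batch size of 2, resuming after cookie 30 in sample_dir needs two requests in this model. A loop that stopped after the first reply would never reach that entry, which matches the symptom described above.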

MFC after: 3 weeks

fusefs: optimize NFS readdir for FUSE_NO_OPENDIR_SUPPORT

In its lowest common denominator, FUSE does not require that a directory
entry's d_off field is valid outside of the lifetime of the directory's
FUSE file handle. But since NFS is stateless, it must reopen the
directory on every call to VOP_READDIR. That means reading the
directory all the way from the first entry. Not only does this create
an O(n^2) condition for large directories, but it can also result in
incorrect behavior if either:

  • The file system _does_ change the d_off field for the last directory entry previously seen by NFS, or
  • The file system deletes the last directory entry previously seen by NFS.

Handily, for file systems that set FUSE_NO_OPENDIR_SUPPORT, d_off is
guaranteed to be valid for the lifetime of the directory entry, so there
is no need to read the directory from the start.
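
To make the cost difference concrete, here is a rough model in C. It is not taken from the patch. The function name entries_scanned and its parameters are hypothetical, and it assumes every NFS readdir call returns a fixed number of entries. It counts how many entries the FUSE server must produce to list a directory once, with and without stable d_off values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical cost model: total entries produced by the FUSE server while
 * an NFS client lists a directory of n entries, per_call entries at a time.
 * Without stable cookies, each call re-reads everything already returned.
 */
static uint64_t
entries_scanned(uint64_t n, uint64_t per_call, bool stable_cookies)
{
	uint64_t total = 0;

	assert(per_call > 0);
	for (uint64_t done = 0; done < n; done += per_call) {
		uint64_t returned = (n - done < per_call) ? n - done : per_call;
		uint64_t skipped = stable_cookies ? 0 : done;

		total += skipped + returned;
	}
	return (total);
}
```

In this model, a directory of 1000 entries read 100 at a time costs 5500 entries when every call rescans from the start, and 1000 entries when the server can resume at the stored d_off.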

MFC after: 3 weeks

fusefs: require FUSE_NO_OPENDIR_SUPPORT for NFS exporting

FUSE file systems that do not set FUSE_NO_OPENDIR_SUPPORT do not
guarantee that d_off will be valid after closing and reopening a
directory. That conflicts with NFS's statelessness, which results in
unresolvable bugs when NFS reads large directories if either:

  • The file system _does_ change the d_off field for the last directory entry previously returned by VOP_READDIR, or
  • The file system deletes the last directory entry previously seen by NFS.

Rather than doing a poor job of exporting such file systems, it's better
just to refuse.
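
A minimal sketch of such a refusal, assuming the check is a simple flag test made when an export file handle is requested. The flag value, the function name sketch_can_export, and the choice of EOPNOTSUPP as the error are assumptions made for illustration. None of them is taken from the patch.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical session flag; the value is invented for this sketch. */
#define SKETCH_NO_OPENDIR_SUPPORT	0x1u

/*
 * Return 0 if the file system may be exported, or an errno value if not.
 * Assumption: exporting is refused unless d_off values stay valid across
 * closing and reopening a directory.
 */
static int
sketch_can_export(uint32_t dataflags)
{
	if ((dataflags & SKETCH_NO_OPENDIR_SUPPORT) == 0)
		return (EOPNOTSUPP);
	return (0);
}
```

Failing early like this gives the administrator a clear error at export time, instead of silently truncated directory listings later.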

Even though this is technically a breaking change, 13.0-RELEASE's
NFS-FUSE support was bad enough that an MFC should be allowed.

MFC after: 3 weeks

Test Plan

Manual testing with bfffs, fuse-ext2, and lklfuse, NFSv3 and NFSv4.2

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Passed
Unit
No Test Coverage
Build Status
Buildable 43682
Build 40570: arc lint + arc unit

Event Timeline

  • Prohibit exporting file systems during VOP_VPTOFH, not VOP_MOUNT

@rmacklem will you be able to review this PR? I'd like to get it into FreeBSD 13.1.

Looks ok, if I understood what the patch does.
Basically, instead of reading a directory from the
beginning of it, it simply refuses to export the file
system unless it has the FSESS_NO_OPENDIR_SUPPORT
property, which means the cookies remain valid.

If that is basically what the patch does, it seems fine to me.

This revision is now accepted and ready to land. Feb 2 2022, 9:03 PM

Yep, that's exactly the idea.