Page MenuHomeFreeBSD

fusefs: set d_off during VOP_READDIR
ClosedPublic

Authored by asomers on Feb 12 2021, 1:04 AM.
Tags
None
Referenced Files
F108373993: D28605.diff
Fri, Jan 24, 6:40 AM
F108311169: D28605.id83811.diff
Thu, Jan 23, 6:45 PM
Unknown Object (File)
Wed, Jan 22, 5:22 PM
Unknown Object (File)
Tue, Jan 21, 9:17 AM
Unknown Object (File)
Sun, Jan 5, 8:55 AM
Unknown Object (File)
Sun, Jan 5, 8:27 AM
Unknown Object (File)
Sun, Jan 5, 8:27 AM
Unknown Object (File)
Fri, Jan 3, 6:18 AM
Subscribers

Details

Summary

This allows lseek to be used with lseek to position the file so that
getdirentries(2) will return the next entry. It is no used by readdir(3).

PR: 253411
Reported by: John Millikin <jmillikin@gmail.com>
MFC after: 1 week

Test Plan

test case added

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Passed
Unit
No Test Coverage
Build Status
Buildable 36957
Build 33846: arc lint + arc unit

Event Timeline

In the commit summary, is this a typo?

This allows lseek to be used with lseek to...

I don't understand the sentence. Is the 2nd lseek FUSE?

It is no used by readdir

typo "no" -> "not"?

sys/fs/fuse/fuse_internal.c
623–624

bytesavail is a bad name for this. Maybe oreclen for output reclen?

641

So, for FreeBSD, I think d_off is expected to be the offset of the *next* dirent in the logical stream. I.e., d_off = nbytes_already_emitted + bytesavail.

It's a little weird for us because we're translating between a FUSE-formatted stream and a dirent-formatted output. I think normally I'd say offsets and seeks on the stream from FreeBSD should be in terms of the dirent output rather than the fuse_dirent input, but that makes lseek from userspace to FUSE basically impossible (with some manual cache internal to FUSE -- ugly).

I think we should (1) keep a running total of nbytes_already_read = sum(freclen), and (2) validate that fudge->off == nbytes_already_read + freclen (or whatever is appropriate for FUSE semantics -- maybe it should just be nbytes_already_read) to verify the internal consistency of the FUSE dirent stream we're getting from the untrusted server.

662–663

This check should go before the if (*fnd_start != 0) fill clause -- startoff is inclusive.

tests/sys/fs/fusefs/Makefile
74–76

GENERIC_DIRSIZ / DIRLEN is guaranteed to produce 8-byte aligned records, and the struct only requires 8-byte alignment on every arch. It might be difficult to persuade the compiler of that, but you could try adding alignment assertions.

tests/sys/fs/fusefs/readdir.cc
225

Zero is a valid fd. (ASSERT_GT(fd, 0) is more legible to me, but whatever.)

228–230

Maybe validate reclen?

asomers added inline comments.
sys/fs/fuse/fuse_internal.c
623–624

Yeah, I agree.

641

You're making assumptions about what d_off means. What you describe makes sense if the stream is a compact sequence of dirents, as it might be for UFS. But the FUSE server is allowed to interpret d_off however it wishes. I don't think we have any choice but to accept whatever the server tells us. The worst that happens is that the user ends up seeking to an invalid offset. But only the server knows which offsets are invalid.

662–663

No, the existing placement is correct. fudge->off is the seek offset of the _next_ dirent, not the current one.

tests/sys/fs/fusefs/Makefile
74–76

I would rather not. Even if it works, I think it would make the code confusing, and confusing is bad in test cases.

asomers marked 4 inline comments as done.
  • fusefs: set d_off during VOP_READDIR
  • Respond to cem's review comments.
sys/fs/fuse/fuse_internal.c
641

Ok

662–663

I'm not sure what I was thinking, you're right.

tests/sys/fs/fusefs/Makefile
74–76

I suppose it could be done in a confusing way, but I don't think it has to be. Tests document assumptions, including the behavior of primitives we rely on. The -Wno-cast-align is a red flag that suggests this code may only work on x86.

Try it? If you've made a good effort and it cannot be done, that's one thing, but if we haven't tried I don't think it's valid to reject.

tests/sys/fs/fusefs/Makefile
74–76

I can't get it to work by adding assertions. The best I can do is to add an alignment macro, and then assert that the alignment macro did nothing. And that seems silly:

	de1p = &(buf[de0->d_reclen]);
	de1 = (struct dirent*)_ALIGN(de1p);	/* Suppress Wcast-align */
	ASSERT_EQ((char*)de1, de1p);
tests/sys/fs/fusefs/Makefile
74–76

Yeah. What bugs me here is that the compiler clearly knows that buf is aligned, because the similar cast:

de0 = (struct dirent*)&buf[0];

doesn't produce a warning, despite clearly increasing alignment requirement from 1 to 8 as well.

But any sort of assertion on reclen or whatever doesn't seem to result in it grokking that the next record is also aligned. So it's a shortcoming in the warning analysis in Clang. Bleh.

For what it's worth, casting through (void*) defeats the warning, although obviously it doesn't do anything for actually ensuring the alignment.

Feel free to keep the -Wno-cast-align

tests/sys/fs/fusefs/Makefile
74–76

I'm guessing that the compiler doesn't complain about de0 since buf is stack-allocated. Either that ensures 8-byte alignment, or the compiler forces 8-byte alignment because it knows about the cast. I think I'll just keep the -Wno-cast-align then. Is there anything else I need to do to this PR?

cem added inline comments.
tests/sys/fs/fusefs/Makefile
74–76

Yeah, one way or another Clang knows that buf is 8-byte aligned. It fails to propagate that information when adding values that are multiples of 8 to the base pointer somehow.

No, nothing else, looks good to me.

This revision is now accepted and ready to land.Feb 13 2021, 4:44 AM
This revision was automatically updated to reflect the committed changes.